Reputation: 43
I have a sequential network that takes in vectored sentences of length 20 words and aims to classify the sentence based on a label. Each word has 300 dimensions. Therefore each sentence has a shape (20, 300). The dataset has 11 samples currently therefore the full x_train is of shape (11, 20, 300)
Below is the code for my Network:
nnmodel = keras.Sequential()
nnmodel.add(keras.layers.InputLayer(input_shape = (20, 300)))
nnmodel.add(keras.layers.Dense(units = 300, activation = "relu"))
nnmodel.add(keras.layers.Dense(units = 20, activation = "relu"))
nnmodel.add(keras.layers.Dense(units = 1, activation = "sigmoid"))
nnmodel.compile(optimizer='adam',
loss='SparseCategoricalCrossentropy',
metrics=['accuracy'])
nnmodel.fit(x_train, y_train, epochs=10, batch_size = 1)
for layer in nnmodel.layers:
print(layer.output_shape)
This gives:
Epoch 1/10
11/11 [==============================] - 0s 1ms/step - loss: 2.9727 - accuracy: 0.0455
Epoch 2/10
11/11 [==============================] - 0s 1ms/step - loss: 2.7716 - accuracy: 0.0682
Epoch 3/10
11/11 [==============================] - 0s 1ms/step - loss: 2.6279 - accuracy: 0.0682
Epoch 4/10
11/11 [==============================] - 0s 1ms/step - loss: 2.4878 - accuracy: 0.0682
Epoch 5/10
11/11 [==============================] - 0s 1ms/step - loss: 2.3145 - accuracy: 0.0545
Epoch 6/10
11/11 [==============================] - 0s 1ms/step - loss: 2.0505 - accuracy: 0.0545
Epoch 7/10
11/11 [==============================] - 0s 1ms/step - loss: 1.7010 - accuracy: 0.0545
Epoch 8/10
11/11 [==============================] - 0s 992us/step - loss: 1.2874 - accuracy: 0.0545
Epoch 9/10
11/11 [==============================] - 0s 891us/step - loss: 0.9628 - accuracy: 0.0545
Epoch 10/10
11/11 [==============================] - 0s 794us/step - loss: 0.7960 - accuracy: 0.0545
(None, 20, 300)
(None, 20, 20)
(None, 20, 1)
Why is my output layer returning (20,1)? It needs to be of shape (1) because my label is just an integer. I'm quite confused and unsure how it is calculating the loss too if the shape is wrong.
Any help would be greatly appreciated/ Thanks
Upvotes: 0
Views: 118
Reputation: 375
With the current code, it is the expected output. Adding a simple dense layer for a multidimensional input will only change the size of the last dimension. If you notice, in CNNs, we generally add a Flatten
after the convolution layers for the same reason. A Flatten
layer essentially reshapes the input array to remove extra dimensions (each sample is now 1 dimensional). Updated code should be:
nnmodel = keras.Sequential()
nnmodel.add(keras.layers.InputLayer(input_shape = (20, 300)))
nnmodel.add(keras.layers.Flatten()) #This is the code change
nnmodel.add(keras.layers.Dense(units = 300, activation = "relu"))
nnmodel.add(keras.layers.Dense(units = 20, activation = "relu"))
nnmodel.add(keras.layers.Dense(units = 1, activation = "sigmoid"))
nnmodel.compile(optimizer='adam',
loss='SparseCategoricalCrossentropy',
metrics=['accuracy'])
nnmodel.fit(x_train, y_train, epochs=10, batch_size = 1)
for layer in nnmodel.layers:
print(layer.output_shape)
Upvotes: 1