ingrid

Reputation: 107

ValueError: Error when checking target: expected dense_19 to have 3 dimensions, but got array with shape (5, 3)

I have the following code that creates an LSTM network using Keras with the TensorFlow backend. This part runs without errors.

import numpy as np
import pandas as pd
from sklearn import model_selection
from keras.models import Sequential
from keras.layers import Dense, Activation, Dropout
from keras.layers.recurrent import LSTM
from keras.utils import np_utils

flights = {
            'flight_stage': [1,0,1,1,0,0,1],
            'scheduled_hour': [16,16,17,17,17,18,18],
            'delay_category': [1,0,2,2,1,0,2]
        }

columns = ['flight_stage', 'scheduled_hour', 'delay_category']

df = pd.DataFrame(flights, columns=columns)

X = df.drop('delay_category', axis=1)
y = df['delay_category']

X_train, X_test, y_train, y_test = model_selection.train_test_split(X, y, test_size=0.25, random_state=42)

nb_features = X_train.shape[1]
nb_classes = y.nunique()
hidden_neurons = 32
timestamps = X_train.shape[0]

# Reshape input data to 3D array
X_train = X_train.values.reshape(1, X_train.shape[0], X_train.shape[1])
X_test = X_test.values.reshape(1, X_test.shape[0], X_test.shape[1])

y_train = np_utils.to_categorical(y_train, nb_classes)
y_test = np_utils.to_categorical(y_test, nb_classes)

model = Sequential()
model.add(LSTM(
                units=hidden_neurons, 
                return_sequences=True, 
                input_shape=(timestamps,nb_features)
              )
         )

model.add(Dropout(0.2))

model.add(Dense(activation='softmax', units=nb_classes))

model.compile(loss="categorical_crossentropy",
              optimizer='adadelta')

But when I start training the model, it fails:

history = model.fit(X_train, y_train, validation_split=0.25, epochs=500, batch_size=2, shuffle=True, verbose=0)

Error:

ValueError: Error when checking target: expected dense_19 to have 3 dimensions, but got array with shape (5, 3)

This error refers to the final Dense layer. I used model.summary() to get the exact dimensions: the output shape of the Dense layer is (None, 5, 3). However, I do not understand why it has 3 dimensions, or what None stands for. How did it end up in this last layer?

Upvotes: 1

Views: 228

Answers (1)

edkeveked

Reputation: 18371

3 is the number of units in the last layer, i.e. the number of classes for the softmax activation.

5 is the number of timesteps. Because the LSTM is created with return_sequences=True, it returns one output per timestep, so the sequence length is carried through to the Dense layer.

None is the batch dimension. Its size is not fixed when the model is built, which simply means the last layer can accept batches of any size, each element having shape [5, 3].
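To make the shape (None, 5, 3) less mysterious, here is a rough sketch of how the shapes propagate through the layers. The two helper functions are hypothetical and only mimic the shape rules; they are not part of the Keras API. Dropout is left out because it does not change the shape.

```python
# Hypothetical helpers that mimic how the output shapes are derived;
# they are NOT part of the Keras API.

def lstm_output_shape(input_shape, units, return_sequences):
    """Shape produced by an LSTM for input (batch, timesteps, features)."""
    batch, timesteps, _features = input_shape
    if return_sequences:
        # One output vector per timestep.
        return (batch, timesteps, units)
    # Only the output of the last timestep is returned.
    return (batch, units)


def dense_output_shape(input_shape, units):
    """Dense acts on the last axis only and leaves the other axes untouched."""
    return input_shape[:-1] + (units,)


# None stands for the batch axis, whose size is unknown when the model is built.
input_shape = (None, 5, 2)

seq = lstm_output_shape(input_shape, units=32, return_sequences=True)
print(dense_output_shape(seq, units=3))   # (None, 5, 3)

last = lstm_output_shape(input_shape, units=32, return_sequences=False)
print(dense_output_shape(last, units=3))  # (None, 3)
```

So with return_sequences=True the model expects 3D targets, while the one-hot labels of shape (5, 3) are 2D, which is what the error message complains about.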

X_train shape: (1, 5, 2)
X_test shape: (1, 2, 2)
y_train shape: (5, 3)
y_test shape: (2, 3)

Looking at these shapes, there is clearly a mismatch between the number of samples in the features and in the labels. The leftmost number is the number of samples (the batch axis), and it must be the same for the features X and the labels y.

'1', 5, 2 => batch size of 1 (X_train)
'5', 3 => batch size of 5 (y_train)
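A quick way to see this for yourself is to compare the first axis of both arrays before calling fit. This is only a small sketch; the zero-filled arrays are placeholders with the same shapes as in the question, not the real data.

```python
import numpy as np

# Placeholder arrays with the same shapes as in the question.
X_train = np.zeros((1, 5, 2))   # 1 sample, 5 timesteps, 2 features
y_train = np.zeros((5, 3))      # 5 rows of one-hot labels, 3 classes

n_x = X_train.shape[0]  # number of samples in the features
n_y = y_train.shape[0]  # number of samples in the labels

print(n_x, n_y)     # 1 5
print(n_x == n_y)   # False
```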

So the sample counts do not match. In addition, to reconcile the 3D output of the LSTM layer with the 2D labels, one can add a Flatten layer before the last Dense layer:

from keras.models import Sequential
from keras.layers import LSTM, Dense, Dropout, Flatten

nb_classes = 3
hidden_neurons = 32

model = Sequential()

# input_shape is (timesteps, features); the batch axis is left out
model.add(LSTM(
                units=hidden_neurons, 
                return_sequences=True, 
                input_shape=(5, 2)
              )
         )

model.add(Dropout(0.2))
# Collapse (timesteps, units) into one vector so Dense outputs (None, nb_classes)
model.add(Flatten())
model.add(Dense(activation='softmax', units=nb_classes))

model.compile(loss='categorical_crossentropy',
              optimizer='adam')
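Keep in mind that the features and labels still need the same number of samples before fit will accept them. Below is a sketch of two possible ways to line them up, shown with NumPy only. The example rows and labels are made up for illustration, and which option is right depends on whether the five rows are meant to be one sequence or five separate samples, so treat both as assumptions.

```python
import numpy as np

labels = np.array([1, 0, 2, 2, 1])                  # five made-up class ids
y_train = np.eye(3)[labels]                         # one-hot labels, shape (5, 3)
X_rows = np.arange(10, dtype=float).reshape(5, 2)   # five made-up rows, two features

# Option A: one sample that is a 5-step sequence, with one label per timestep.
# This matches an output of shape (None, 5, 3), i.e. return_sequences=True
# and no Flatten layer.
X_seq = X_rows.reshape(1, 5, 2)
y_seq = y_train.reshape(1, 5, 3)

# Option B: five independent samples, each a 1-step sequence.
# This matches an output of shape (None, 3) with input_shape=(1, 2).
X_single = X_rows.reshape(5, 1, 2)
y_single = y_train

print(X_seq.shape, y_seq.shape)        # (1, 5, 2) (1, 5, 3)
print(X_single.shape, y_single.shape)  # (5, 1, 2) (5, 3)
```

In both options the leftmost dimension of X and y is the same, which is the condition described above.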


Upvotes: 1
