Reputation: 2147
I have ECG data as a tensor of shape (samples, timesteps, features), e.g. (2464, 15000, 1).
I want to classify labels in the range 1 to 5. I one-hot encoded the target so that it has shape (2464, 5).
Before moving on to an LSTM I wanted to try a basic approach with the following sequential model:
def build_model():
    model = models.Sequential()
    model.add(layers.Dense(16, activation='relu', input_shape=(X_train.shape[1], X_train.shape[2])))
    model.add(layers.Dense(16, activation='relu'))
    model.add(layers.Dense(y_train.shape[1], activation='softmax'))
    model.compile(loss='sparse_categorical_crossentropy',
                  optimizer='rmsprop',
                  metrics=['mae'])
    return model

model = build_model()
history = model.fit(X_train,
                    y_train,
                    epochs=20,
                    batch_size=512,
                    validation_data=(X_val, y_val))
Unfortunately, I get a ValueError:
File "C:\ProgramData\Anaconda3\lib\site-packages\keras\engine\training.py", line 113, in _standardize_input_data 'with shape ' + str(data_shape))
ValueError: Error when checking target: expected dense_13 to have 3 dimensions, but got array with shape (2464, 5)
I searched for other questions about this problem. In most cases the output layer does not match the number of target classes, which is 5 in my case, so that should be correct. Another common cause is a target that is not one-hot encoded, but that is done as well. The error tells me the target should have three dimensions (based on the input?). But if I add a dimension so that the target has shape (2464, 5, 1), the ValueError changes and now expects the same dimensions as the input data, whereas I need to reduce the output to the y target with the softmax layer.
ValueError: Error when checking target: expected dense_23 to have shape (15000, 1) but got array with shape (5, 1)
I am confused. Could you give me a hint?
Extra: I also tried flattening the input, but then I get a different ValueError (...shape (1,) but got array with shape (5,)). Is flattening the correct approach here?
Thanks for your help.
Upvotes: 0
Views: 927
Reputation: 2706
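The problem here is actually your loss function. Since your y vectors are already one-hot encoded, you should not use sparse categorical cross-entropy; use categorical cross-entropy instead. In addition, a Dense layer only acts on the last axis of its input, so a 3D input of shape (samples, timesteps, 1) produces a 3D output, which is why Keras asks for a 3D target. Reshaping the input to 2D, (samples, timesteps), avoids that. Try this.
Let's make some dummy data with matching dimensions (150 timesteps instead of 15000, to keep it small) and reshape the input to 2D.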
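import numpy as np
import keras
from keras.models import Sequential
from keras.layers import Dense

# Dummy inputs of shape (samples, timesteps, features)
X_train = np.zeros((2464, 150, 1))
X_val = np.zeros((2464, 150, 1))
# Dummy integer class labels (all class 0 here)
y_train = np.zeros((2464,))
y_val = np.zeros((2464,))
# Drop the trailing feature axis so each sample is a flat vector
X_train_r = X_train.reshape(2464, 150)
X_val_r = X_val.reshape(2464, 150)
input_shape = (150,)
print(X_train_r.shape)
print(X_val_r.shape)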
(2464, 150)
(2464, 150)
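To see why the reshape matters, here is a rough NumPy sketch of what a stack of Dense layers does to the shapes. The `dense` helper is a hypothetical stand-in of mine (no bias, no activation), not Keras code; it only assumes that a Dense layer multiplies along the last axis of its input.

```python
import numpy as np

rng = np.random.default_rng(0)

def dense(x, units):
    # Hypothetical stand-in for a Dense layer without bias or activation:
    # a matrix product over the LAST axis of x only.
    w = rng.normal(size=(x.shape[-1], units))
    return x @ w

x_3d = np.zeros((8, 150, 1))   # (samples, timesteps, features)
x_2d = x_3d.reshape(8, 150)    # (samples, timesteps)

# Same two-layer stack applied to both inputs
out_3d = dense(dense(x_3d, 16), 5)
out_2d = dense(dense(x_2d, 16), 5)

print(out_3d.shape)  # (8, 150, 5): still one prediction per timestep
print(out_2d.shape)  # (8, 5): one prediction per sample
```

With the 3D input the timestep axis survives all the way to the output, so a target of shape (samples, 5) cannot match it.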
Now we will get our one hot encoded outputs for our labels.
num_classes = 5
# Convert class vectors to binary class matrices. This uses 1 hot encoding.
y_train_binary = keras.utils.to_categorical(y_train, num_classes)
y_val_binary = keras.utils.to_categorical(y_val, num_classes)
print(y_train_binary.shape)
print(y_val_binary.shape)
(2464, 5)
(2464, 5)
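One thing to watch for: you mention that your labels run from 1 to 5, while to_categorical (and the sparse loss) expect class indices starting at 0. If your raw labels are still 1 to 5, shift them first. A small sketch with made-up labels, using plain NumPy for the one-hot step so the shapes are easy to check:

```python
import numpy as np

num_classes = 5
labels = np.array([1, 5, 3, 2, 4, 1])    # hypothetical labels in the range 1..5

labels_zero_based = labels - 1            # now in the range 0..4
# Row i of the identity matrix is the one-hot vector for class i
one_hot = np.eye(num_classes)[labels_zero_based]

print(one_hot.shape)  # (6, 5)
```

The shifted integer labels can also be passed as they are to the sparse variant of the loss.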
Now we will make our model as follows
model = Sequential()
model.add(Dense(16, activation='relu', input_shape=input_shape))
model.add(Dense(16, activation='relu'))
model.add(Dense(num_classes, activation='softmax'))
model.compile(loss=keras.losses.categorical_crossentropy,
              optimizer=keras.optimizers.Adadelta(),
              metrics=['accuracy'])
model.summary()
history = model.fit(X_train_r,
                    y_train_binary,
                    epochs=5,
                    batch_size=8,
                    validation_data=(X_val_r, y_val_binary))
This will work for you.
If you want to use sparse categorical cross-entropy, then you should not one-hot encode the labels; pass the integer class labels instead, like this:
X_train = np.zeros((2464, 150, 1))
y_train = np.zeros((2464,))
X_val = np.zeros((2464, 150, 1))
y_val = np.zeros((2464,))
X_train_r = X_train.reshape(2464,150,)
X_val_r = X_val.reshape(2464,150,)
input_shape = (150,)
num_classes = 5
model = Sequential()
model.add(Dense(16, activation='relu', input_shape=input_shape))
model.add(Dense(16, activation='relu'))
model.add(Dense(num_classes, activation='softmax'))
# Accuracy is more informative than MAE for a classification problem
model.compile(loss='sparse_categorical_crossentropy',
              optimizer='rmsprop',
              metrics=['accuracy'])
model.summary()
# y_train and y_val are integer labels of shape (samples,), not one-hot
history = model.fit(X_train_r,
                    y_train,
                    epochs=5,
                    batch_size=8,
                    validation_data=(X_val_r, y_val))
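In case it helps to see that the two losses only differ in the label format they expect, here is a small NumPy sketch. The probabilities and labels are made up for illustration, and the formulas are the textbook cross-entropy written by hand, not the Keras implementation.

```python
import numpy as np

# Hypothetical softmax outputs for 3 samples and 5 classes (rows sum to 1)
probs = np.array([
    [0.70, 0.10, 0.10, 0.05, 0.05],
    [0.10, 0.20, 0.50, 0.10, 0.10],
    [0.05, 0.05, 0.10, 0.10, 0.70],
])
labels = np.array([0, 2, 4])     # integer labels, shape (3,)
one_hot = np.eye(5)[labels]      # one-hot labels, shape (3, 5)

# Categorical cross-entropy: takes one-hot targets
cce = -np.sum(one_hot * np.log(probs), axis=1)

# Sparse categorical cross-entropy: takes integer targets
scce = -np.log(probs[np.arange(len(labels)), labels])

print(np.allclose(cce, scce))  # True
```

So the choice between them is only about whether your targets have shape (samples, 5) or (samples,).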
Upvotes: 1