Reputation: 2044
I am trying to build a very simple multilayer perceptron (MLP) in Keras:
model = Sequential()
model.add(Dense(16, 8, init='uniform', activation='tanh'))
model.add(Dropout(0.5))
model.add(Dense(8, 2, init='uniform', activation='tanh'))
sgd = SGD(lr=0.01, decay=1e-6, momentum=0.9, nesterov=True)
model.compile(loss='mean_squared_error', optimizer=sgd)
model.fit(X_train, y_train, nb_epoch=1000, batch_size=50)
score = model.evaluate(X_test, y_test, batch_size=50)
My training data has shape (34180, 16), as given by X_train.shape.
The labels are binary, and y_train.shape gives (34180,).
So my Keras code should produce a network with the following connections: 16x8 => 8x2. Instead, it produces this shape mismatch error:
ValueError: Input dimension mis-match. (input[0].shape[1] = 2, input[1].shape[1] = 1)
Apply node that caused the error: Elemwise{sub,no_inplace}(Elemwise{Composite{tanh((i0 + i1))}}[(0, 0)].0, <TensorType(float64, matrix)>)
Inputs types: [TensorType(float64, matrix), TensorType(float64, matrix)]
Inputs shapes: [(50, 2), (50, 1)]
Inputs strides: [(16, 8), (8, 8)]
This happens at epoch 0, at the line model.fit(X_train, y_train, nb_epoch=1000, batch_size=50). Am I overlooking something obvious in Keras?
EDIT: I have gone through the question here, but it does not solve my problem.
Upvotes: 3
Views: 7310
Reputation: 894
I had the same problem and then found this thread:
https://github.com/fchollet/keras/issues/68
It appears that to use a final output layer of size 2 (or of any number of categories), the labels need to be categorical, i.e. one binary (one-hot) vector per observation. For example, the 3-class label vector [0,2,1,0,1,0] becomes [[1,0,0],[0,0,1],[0,1,0],[1,0,0],[0,1,0],[1,0,0]]. Your y_train has shape (34180,), so each batch of targets is (50, 1) while the network outputs (50, 2), which is the mismatch in the error.
The np_utils.to_categorical function solved this for me:
from keras.utils import np_utils, generic_utils
y_train, y_test = [np_utils.to_categorical(x) for x in (y_train, y_test)]
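To make the encoding concrete, here is a minimal pure-Python sketch of what one-hot encoding does to a label vector. The one_hot helper is hypothetical and written only for illustration. It is not the Keras implementation, and it assumes the labels are integers numbered from 0.

```python
def one_hot(labels, num_classes=None):
    """Turn integer class labels into one-hot rows.

    Hypothetical helper for illustration only, not a Keras function.
    """
    labels = [int(label) for label in labels]
    if num_classes is None:
        # Assumption: classes are numbered 0..max(labels).
        num_classes = max(labels) + 1
    rows = []
    for label in labels:
        row = [0] * num_classes
        row[label] = 1  # mark the column belonging to this observation's class
        rows.append(row)
    return rows


# The 3-class example from the explanation above.
encoded = one_hot([0, 2, 1, 0, 1, 0])
print(encoded)

# Binary labels, as in the question: every 0/1 label becomes a row of
# length 2, which lines up with an output layer of 2 units.
binary = one_hot([0, 1, 1, 0])
print(len(binary), len(binary[0]))
```

With binary labels, each target row then has two columns, so the targets should line up with the two units of the final Dense layer instead of being a single column.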
Upvotes: 10