Reputation: 1727
So, my input is of shape (1, 893, 463), or, more generally, (None, None, 463). This corresponds to 1 sample of 893 time steps, each with 463 features. The output shape is (1, 893, 2), i.e. (None, None, 2).
My network structure looks like this:
model = Sequential()
model.add(Convolution1D(64, 5, input_dim = one_input_length, border_mode = "same", W_regularizer = l2(0.01)))
model.add(MaxPooling1D(10, border_mode = "same"))
model.add(Convolution1D(64, 5, border_mode = "same", W_regularizer = l2(0.01)))
model.add(MaxPooling1D(10, border_mode = "same"))
model.add(GRU(300, return_sequences = True, W_regularizer = l2(0.01), U_regularizer = l2(0.01)))
model.add(TimeDistributed(Dense(2, activation='sigmoid')))
Compiled like this:
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
The problem is that when I call model.fit(test_X, test_Y, nb_epoch = ....), I get the following error: Incompatible shapes: [1,893] vs. [1,9], which traces back to the compile line.
I logged the shapes of the outputs of each of the layers using this technique, coming up with this:
Input: (1, 893, 463)
Conv_1: (1, 893, 64)
Pool_1: (1, 90, 64)
Conv_2: (1, 90, 64)
Pool_2: (1, 9, 64)
GRU: (1, 9, 300)
Dense: (1, 9, 2)
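The logged lengths can be reproduced with a little arithmetic. The sketch below assumes that a "same"-padded pooling layer whose stride equals its pool length produces ceil(n / pool_length) steps; the helper name same_pool_length is made up for illustration and is not part of Keras.

```python
def same_pool_length(n_steps, pool_length):
    """Output length of 'same'-padded pooling with stride == pool_length.

    Assumption: the layer emits ceil(n_steps / pool_length) steps.
    """
    return -(-n_steps // pool_length)  # integer ceiling division


time_steps = 893
after_pool_1 = same_pool_length(time_steps, 10)    # matches the Pool_1 length in the log
after_pool_2 = same_pool_length(after_pool_1, 10)  # matches the Pool_2 length in the log

print(time_steps, after_pool_1, after_pool_2)
```

Under that assumption the two pooling layers account for the whole drop from 893 to 9 along the second axis; the "same"-padded convolutions leave the length unchanged.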
I suspect this occurs when the model tries to calculate accuracy and finds that, for 893 correct outputs, it only has 9 predictions. For some reason, the second Convolution1D layer starts convolving on the time steps, not on the features, as the first one did.
Why is this, and how do I fix this?
Model summary:
____________________________________________________________________________________________________
Layer (type)                     Output Shape          Param #     Connected to
====================================================================================================
convolution1d_1 (Convolution1D)  (None, None, 64)      148224      convolution1d_input_1[0][0]
____________________________________________________________________________________________________
maxpooling1d_1 (MaxPooling1D)    (None, None, 64)      0           convolution1d_1[0][0]
____________________________________________________________________________________________________
convolution1d_2 (Convolution1D)  (None, None, 64)      20544       maxpooling1d_1[0][0]
____________________________________________________________________________________________________
maxpooling1d_2 (MaxPooling1D)    (None, None, 64)      0           convolution1d_2[0][0]
____________________________________________________________________________________________________
gru_1 (GRU)                      (None, None, 300)     328500      maxpooling1d_2[0][0]
____________________________________________________________________________________________________
timedistributed_1 (TimeDistribut (None, None, 2)       602         gru_1[0][0]
====================================================================================================
Total params: 497,870
Trainable params: 497,870
Non-trainable params: 0
____________________________________________________________________________________________________
I am trying to make a CNN-LSTM classifier which, given time series data, will give an output for each time step.
Full error message:
Traceback (most recent call last):
  File "/Users/user/.pyenvs/MLPy3/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1021, in _do_call
    return fn(*args)
  File "/Users/user/.pyenvs/MLPy3/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1003, in _run_fn
    status, run_metadata)
  File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/contextlib.py", line 66, in __exit__
    next(self.gen)
  File "/Users/user/.pyenvs/MLPy3/lib/python3.5/site-packages/tensorflow/python/framework/errors_impl.py", line 469, in raise_exception_on_not_ok_status
    pywrap_tensorflow.TF_GetCode(status))
tensorflow.python.framework.errors_impl.InvalidArgumentError: Incompatible shapes: [1,893] vs. [1,9]
     [[Node: Equal = Equal[T=DT_INT64, _device="/job:localhost/replica:0/task:0/cpu:0"](ArgMax, ArgMax_1)]]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "Stock_CNN_LSTM.py", line 89, in <module>
    model.fit(test_X, test_Y, nb_epoch=nb_epoch, verbose = 2, callbacks=[TestCallback((test_X, test_Y)), ModelCheckpoint("cnn_lstm_model-{epoch:02d}.h5")], initial_epoch = initial_epoch)
  File "/Users/user/.pyenvs/MLPy3/lib/python3.5/site-packages/keras/models.py", line 672, in fit
    initial_epoch=initial_epoch)
  File "/Users/user/.pyenvs/MLPy3/lib/python3.5/site-packages/keras/engine/training.py", line 1192, in fit
    initial_epoch=initial_epoch)
  File "/Users/user/.pyenvs/MLPy3/lib/python3.5/site-packages/keras/engine/training.py", line 892, in _fit_loop
    outs = f(ins_batch)
  File "/Users/user/.pyenvs/MLPy3/lib/python3.5/site-packages/keras/backend/tensorflow_backend.py", line 1900, in __call__
    feed_dict=feed_dict)
  File "/Users/user/.pyenvs/MLPy3/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 766, in run
    run_metadata_ptr)
  File "/Users/user/.pyenvs/MLPy3/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 964, in _run
    feed_dict_string, options, run_metadata)
  File "/Users/user/.pyenvs/MLPy3/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1014, in _do_run
    target_list, options, run_metadata)
  File "/Users/user/.pyenvs/MLPy3/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1034, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Incompatible shapes: [1,893] vs. [1,9]
     [[Node: Equal = Equal[T=DT_INT64, _device="/job:localhost/replica:0/task:0/cpu:0"](ArgMax, ArgMax_1)]]

Caused by op 'Equal', defined at:
  File "Stock_CNN_LSTM.py", line 71, in <module>
    model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
  File "/Users/user/.pyenvs/MLPy3/lib/python3.5/site-packages/keras/models.py", line 594, in compile
    **kwargs)
  File "/Users/user/.pyenvs/MLPy3/lib/python3.5/site-packages/keras/engine/training.py", line 713, in compile
    append_metric(i, 'acc', acc_fn(y_true, y_pred))
  File "/Users/user/.pyenvs/MLPy3/lib/python3.5/site-packages/keras/metrics.py", line 11, in categorical_accuracy
    K.argmax(y_pred, axis=-1)))
  File "/Users/user/.pyenvs/MLPy3/lib/python3.5/site-packages/keras/backend/tensorflow_backend.py", line 1275, in equal
    return tf.equal(x, y)
  File "/Users/user/.pyenvs/MLPy3/lib/python3.5/site-packages/tensorflow/python/ops/gen_math_ops.py", line 728, in equal
    result = _op_def_lib.apply_op("Equal", x=x, y=y, name=name)
  File "/Users/user/.pyenvs/MLPy3/lib/python3.5/site-packages/tensorflow/python/framework/op_def_library.py", line 759, in apply_op
    op_def=op_def)
  File "/Users/user/.pyenvs/MLPy3/lib/python3.5/site-packages/tensorflow/python/framework/ops.py", line 2240, in create_op
    original_op=self._default_original_op, op_def=op_def)
  File "/Users/user/.pyenvs/MLPy3/lib/python3.5/site-packages/tensorflow/python/framework/ops.py", line 1128, in __init__
    self._traceback = _extract_stack()

InvalidArgumentError (see above for traceback): Incompatible shapes: [1,893] vs. [1,9]
     [[Node: Equal = Equal[T=DT_INT64, _device="/job:localhost/replica:0/task:0/cpu:0"](ArgMax, ArgMax_1)]]
Thanks!
Upvotes: 1
Views: 665
Reputation: 59681
Right now all the convolution/pooling layers are acting over time, so each pooling layer shrinks the number of time steps. If you want to apply them in the feature space instead, you would need to wrap them in TimeDistributed and add an extra dimension at the end of your input. Then you would need to remove the extra dimension before passing the data to the GRU layer. However, you would only be able to do that if you have one output channel per convolution:
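import keras.backend as K
from keras.models import Sequential
from keras.layers import Convolution1D, MaxPooling1D, GRU, Dense, Lambda, TimeDistributed
from keras.regularizers import l2

model = Sequential()
# Add a trailing channel axis: (batch, time, features) -> (batch, time, features, 1).
# The input shape is declared here because this is the first layer of the Sequential model.
model.add(Lambda(lambda x: K.expand_dims(x, -1), input_shape = (None, one_input_length)))
# Convolve and pool along the feature axis, separately for every time step.
model.add(TimeDistributed(Convolution1D(1, 5, border_mode = "same", W_regularizer = l2(0.01))))
model.add(TimeDistributed(MaxPooling1D(10, border_mode = "same")))
model.add(TimeDistributed(Convolution1D(1, 5, border_mode = "same", W_regularizer = l2(0.01))))
model.add(TimeDistributed(MaxPooling1D(10, border_mode = "same")))
# Drop the single channel axis so the GRU receives (batch, time, pooled_features).
model.add(Lambda(lambda x: K.squeeze(x, -1)))
model.add(GRU(300, return_sequences = True, W_regularizer = l2(0.01), U_regularizer = l2(0.01)))
model.add(TimeDistributed(Dense(2, activation='sigmoid')))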
If you want to use multiple output channels in your convolutions, then you would need to create some kind of "matrix" of GRU units.
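As a rough check of the single-channel variant, the per-sample shapes can be traced by hand. This is only a sketch: it assumes "same"-padded pooling with a stride equal to the pool length yields ceil(n / pool_length) entries, and the helper name trace_feature_pooling is hypothetical, not a Keras function.

```python
def trace_feature_pooling(time_steps, n_features, pool_length=10, n_pools=2):
    """Trace (time, features, channels) when pooling acts on the feature axis.

    Assumption: each 'same'-padded pooling step maps n -> ceil(n / pool_length);
    the 'same'-padded convolutions keep the feature length unchanged.
    """
    shapes = [(time_steps, n_features, 1)]  # after expand_dims adds a channel axis
    features = n_features
    for _ in range(n_pools):
        features = -(-features // pool_length)  # integer ceiling division
        shapes.append((time_steps, features, 1))
    return shapes


shapes = trace_feature_pooling(893, 463)
for shape in shapes:
    print(shape)

# After squeezing the channel axis, the GRU sees (time, pooled_features).
gru_input = shapes[-1][:2]
print(gru_input)
```

In this trace the time axis stays at 893 throughout, so a per-time-step output of shape (1, 893, 2) would line up with the labels, while the GRU receives only a handful of pooled features per step.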
Upvotes: 2