Ivan

Reputation: 1727

Convolution1D in Keras convolves on time steps instead of features?

So, my input is of shape (1, 893, 463), or, more generally, (None, None, 463). This corresponds to 1 sample of 893 time steps, each with 463 features. The output shape is (1, 893, 2), i.e. (None, None, 2).

My network structure looks like this:

model = Sequential()
model.add(Convolution1D(64, 5, input_dim = one_input_length, border_mode = "same", W_regularizer = l2(0.01)))
model.add(MaxPooling1D(10, border_mode = "same"))
model.add(Convolution1D(64, 5, border_mode = "same", W_regularizer = l2(0.01)))
model.add(MaxPooling1D(10, border_mode = "same"))
model.add(GRU(300, return_sequences = True, W_regularizer = l2(0.01), U_regularizer = l2(0.01)))
model.add(TimeDistributed(Dense(2, activation='sigmoid')))

Compiled like this:

model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])

The problem is that when I call model.fit(test_X, test_Y, nb_epoch = ....), I get the following error: Incompatible shapes: [1,893] vs. [1,9], tracing back to the compile line.

I logged the shapes of the outputs of each of the layers using this technique, coming up with this:

Input:  (1, 893, 463)
Conv_1: (1, 893, 64)
Pool_1: (1, 90, 64)
Conv_2: (1, 90, 64)
Pool_2: (1, 9, 64)
GRU:    (1, 9, 300)
Dense:  (1, 9, 2)
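
The time-axis lengths in this log (893, 90, 9) are consistent with each pooling step dividing the length by 10 and rounding up. A quick check of that arithmetic, assuming the pooling stride equals the pool length and that "same" padding rounds the result up (the helper name pooled_length is only for illustration):

```python
def pooled_length(length, pool_length):
    """Length after 'same'-padded pooling, assuming stride == pool_length."""
    # Ceiling division done with integers to avoid float rounding.
    return -(-length // pool_length)

time_steps = 893
after_pool_1 = pooled_length(time_steps, 10)    # first MaxPooling1D(10)
after_pool_2 = pooled_length(after_pool_1, 10)  # second MaxPooling1D(10)
print(after_pool_1, after_pool_2)
```

The convolutions with border_mode = "same" leave the length unchanged, so only the two pooling steps account for the drop from 893 to 9.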

I suspect this occurs when the model tries to calculate accuracy and finds that it has only 9 predictions for 893 correct outputs. For some reason, the second Convolution1D layer seems to start convolving on the time steps, not on the features, as the first one did.

Why is this, and how do I fix this?

EDIT:

Model summary:

____________________________________________________________________________________________________
Layer (type)                     Output Shape          Param #     Connected to
====================================================================================================
convolution1d_1 (Convolution1D)  (None, None, 64)      148224      convolution1d_input_1[0][0]
____________________________________________________________________________________________________
maxpooling1d_1 (MaxPooling1D)    (None, None, 64)      0           convolution1d_1[0][0]
____________________________________________________________________________________________________
convolution1d_2 (Convolution1D)  (None, None, 64)      20544       maxpooling1d_1[0][0]
____________________________________________________________________________________________________
maxpooling1d_2 (MaxPooling1D)    (None, None, 64)      0           convolution1d_2[0][0]
____________________________________________________________________________________________________
gru_1 (GRU)                      (None, None, 300)     328500      maxpooling1d_2[0][0]
____________________________________________________________________________________________________
timedistributed_1 (TimeDistribut (None, None, 2)       602         gru_1[0][0]
====================================================================================================
Total params: 497,870
Trainable params: 497,870
Non-trainable params: 0
____________________________________________________________________________________________________

I am trying to make a CNN-LSTM classifier which, given time series data, will give an output for each time step.

Full error message:

Traceback (most recent call last):
  File "/Users/user/.pyenvs/MLPy3/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1021, in _do_call
    return fn(*args)
  File "/Users/user/.pyenvs/MLPy3/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1003, in _run_fn
    status, run_metadata)
  File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/contextlib.py", line 66, in __exit__
    next(self.gen)
  File "/Users/user/.pyenvs/MLPy3/lib/python3.5/site-packages/tensorflow/python/framework/errors_impl.py", line 469, in raise_exception_on_not_ok_status
    pywrap_tensorflow.TF_GetCode(status))
tensorflow.python.framework.errors_impl.InvalidArgumentError: Incompatible shapes: [1,893] vs. [1,9]
     [[Node: Equal = Equal[T=DT_INT64, _device="/job:localhost/replica:0/task:0/cpu:0"](ArgMax, ArgMax_1)]]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "Stock_CNN_LSTM.py", line 89, in <module>
    model.fit(test_X, test_Y, nb_epoch=nb_epoch, verbose = 2, callbacks=[TestCallback((test_X, test_Y)), ModelCheckpoint("cnn_lstm_model-{epoch:02d}.h5")], initial_epoch = initial_epoch)
  File "/Users/user/.pyenvs/MLPy3/lib/python3.5/site-packages/keras/models.py", line 672, in fit
    initial_epoch=initial_epoch)
  File "/Users/user/.pyenvs/MLPy3/lib/python3.5/site-packages/keras/engine/training.py", line 1192, in fit
    initial_epoch=initial_epoch)
  File "/Users/user/.pyenvs/MLPy3/lib/python3.5/site-packages/keras/engine/training.py", line 892, in _fit_loop
    outs = f(ins_batch)
  File "/Users/user/.pyenvs/MLPy3/lib/python3.5/site-packages/keras/backend/tensorflow_backend.py", line 1900, in __call__
    feed_dict=feed_dict)
  File "/Users/user/.pyenvs/MLPy3/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 766, in run
    run_metadata_ptr)
  File "/Users/user/.pyenvs/MLPy3/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 964, in _run
    feed_dict_string, options, run_metadata)
  File "/Users/user/.pyenvs/MLPy3/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1014, in _do_run
    target_list, options, run_metadata)
  File "/Users/user/.pyenvs/MLPy3/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1034, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Incompatible shapes: [1,893] vs. [1,9]
     [[Node: Equal = Equal[T=DT_INT64, _device="/job:localhost/replica:0/task:0/cpu:0"](ArgMax, ArgMax_1)]]

Caused by op 'Equal', defined at:
  File "Stock_CNN_LSTM.py", line 71, in <module>
    model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
  File "/Users/user/.pyenvs/MLPy3/lib/python3.5/site-packages/keras/models.py", line 594, in compile
    **kwargs)
  File "/Users/user/.pyenvs/MLPy3/lib/python3.5/site-packages/keras/engine/training.py", line 713, in compile
    append_metric(i, 'acc', acc_fn(y_true, y_pred))
  File "/Users/user/.pyenvs/MLPy3/lib/python3.5/site-packages/keras/metrics.py", line 11, in categorical_accuracy
    K.argmax(y_pred, axis=-1)))
  File "/Users/user/.pyenvs/MLPy3/lib/python3.5/site-packages/keras/backend/tensorflow_backend.py", line 1275, in equal
    return tf.equal(x, y)
  File "/Users/user/.pyenvs/MLPy3/lib/python3.5/site-packages/tensorflow/python/ops/gen_math_ops.py", line 728, in equal
    result = _op_def_lib.apply_op("Equal", x=x, y=y, name=name)
  File "/Users/user/.pyenvs/MLPy3/lib/python3.5/site-packages/tensorflow/python/framework/op_def_library.py", line 759, in apply_op
    op_def=op_def)
  File "/Users/user/.pyenvs/MLPy3/lib/python3.5/site-packages/tensorflow/python/framework/ops.py", line 2240, in create_op
    original_op=self._default_original_op, op_def=op_def)
  File "/Users/user/.pyenvs/MLPy3/lib/python3.5/site-packages/tensorflow/python/framework/ops.py", line 1128, in __init__
    self._traceback = _extract_stack()

InvalidArgumentError (see above for traceback): Incompatible shapes: [1,893] vs. [1,9]
     [[Node: Equal = Equal[T=DT_INT64, _device="/job:localhost/replica:0/task:0/cpu:0"](ArgMax, ArgMax_1)]]

Thanks!

Upvotes: 1

Views: 665

Answers (1)

javidcf

Reputation: 59681

Right now all the convolution and pooling layers act along the time axis, and the two pooling layers are what shrink it from 893 steps to 9. If you want to apply them in the feature space instead, you need to wrap them in TimeDistributed and add an extra dimension at the end of your input. You then need to remove the extra dimension before passing the data to the GRU layer. However, this only works if each convolution has a single output channel:

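import keras.backend as K
from keras.models import Sequential
from keras.layers import Lambda, TimeDistributed, Convolution1D, MaxPooling1D, GRU, Dense
from keras.regularizers import l2

model = Sequential()
# Add a trailing channel axis: (batch, time, features) -> (batch, time, features, 1).
# The input shape is declared on the first layer (assumed fix; it was input_dim on the convolution).
model.add(Lambda(lambda x: K.expand_dims(x, -1), input_shape = (None, one_input_length)))
# Each TimeDistributed layer convolves or pools along the feature axis of every time step.
model.add(TimeDistributed(Convolution1D(1, 5, border_mode = "same", W_regularizer = l2(0.01))))
model.add(TimeDistributed(MaxPooling1D(10, border_mode = "same")))
model.add(TimeDistributed(Convolution1D(1, 5, border_mode = "same", W_regularizer = l2(0.01))))
model.add(TimeDistributed(MaxPooling1D(10, border_mode = "same")))
# Drop the channel axis again so the GRU receives (batch, time, features).
model.add(Lambda(lambda x: K.squeeze(x, -1)))
model.add(GRU(300, return_sequences = True, W_regularizer = l2(0.01), U_regularizer = l2(0.01)))
model.add(TimeDistributed(Dense(2, activation='sigmoid')))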
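
To see why this layout keeps one prediction per time step, here is a rough shape trace in plain Python (no Keras needed). It is a sketch under a few assumptions: "same" padding, a pooling stride equal to the pool length, and the (1, 893, 463) input from the question. The helper names same_pool and trace_shapes are made up for illustration.

```python
def same_pool(length, pool_length):
    """Length after 'same'-padded pooling, assuming stride == pool_length."""
    return -(-length // pool_length)  # integer ceiling division

def trace_shapes(batch, time_steps, features, gru_units=300, classes=2):
    """Return (layer name, output shape) pairs for the TimeDistributed layout."""
    shapes = [("Input", (batch, time_steps, features))]
    # expand_dims adds a channel axis so each time step looks like a 1D signal.
    shapes.append(("expand_dims", (batch, time_steps, features, 1)))
    for i in (1, 2):
        # A 'same' convolution with one filter keeps the feature length.
        shapes.append(("Conv_%d" % i, (batch, time_steps, features, 1)))
        # Pooling now shrinks the feature axis, not the time axis.
        features = same_pool(features, 10)
        shapes.append(("Pool_%d" % i, (batch, time_steps, features, 1)))
    # squeeze removes the single channel before the recurrent layer.
    shapes.append(("squeeze", (batch, time_steps, features)))
    shapes.append(("GRU", (batch, time_steps, gru_units)))
    shapes.append(("Dense", (batch, time_steps, classes)))
    return shapes

for name, shape in trace_shapes(1, 893, 463):
    print(name, shape)
```

Under these assumptions the feature axis goes 463 -> 47 -> 5 while the time axis stays at 893, so the final output has the (1, 893, 2) shape that the targets have.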

If you want to use multiple output channels in your convolutions, you would need to create some kind of "matrix" of GRU units.

Upvotes: 2
