Keras - cnn with several input filters

Question

I'm experimenting with Keras and trying to use deep learning to predict an evaluation score for patients' exercises. However, I have been stuck for a long time trying to fit the network with a custom generator.

The intent: Let's say we have patients coming in for an examination. Each patient has to do 9 exercises while being captured by a camera. The input for us is a sequence of 21 points (3 dims each) over time. With another column for the timestamp, that means a table of 64 (21*3+1) columns. Each time step is represented by one row of values in the table.
Each patient is therefore represented by nine tables. So the net I'm trying to implement should take nine tables of variable length as input and output an evaluation of the patient, which is a single number.
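To make the data layout concrete, here is a minimal NumPy sketch of one patient as nine tables of different lengths, zero-padded to a common row count so they can be stacked. The row counts, the random values, and the helper name pad_table are hypothetical and only serve the illustration; zero-padding at the end is one possible choice, not necessarily what the original data pipeline does.

```python
import numpy as np

N_POINTS = 21               # tracked points per frame (from the question)
N_COLS = N_POINTS * 3 + 1   # x, y, z per point plus a timestamp -> 64

def pad_table(table, target_rows):
    """Zero-pad a (rows, N_COLS) table at the end so it has target_rows rows."""
    missing = target_rows - table.shape[0]
    if missing < 0:
        raise ValueError("table is longer than target_rows")
    return np.pad(table, ((0, missing), (0, 0)), mode="constant")

# Hypothetical patient: nine exercises with different numbers of time steps.
rng = np.random.default_rng(0)
row_counts = [120, 300, 678, 45, 512, 230, 99, 400, 610]
tables = [rng.normal(size=(rows, N_COLS)) for rows in row_counts]

# Pad every table to the longest one and stack them into a single array.
max_rows = max(row_counts)
patient = np.stack([pad_table(t, max_rows) for t in tables])
print(patient.shape)  # (9, 678, 64)
```

With padding like this, every exercise shares the same number of rows, which matches the fixed input length used when building the model.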
I followed several guides and ended up with the following.

import tensorflow as tf

def get_base_model(input_len, fsize, width):
    # One exercise: (time steps, columns) -> 50-dimensional embedding.
    input_seq = tf.keras.layers.Input(shape=(input_len, width))

    nb_filters = 10

    convolved = tf.keras.layers.Conv1D(
        nb_filters,
        fsize,
        padding="same",
        activation="tanh"
    )(input_seq)
    processed = tf.keras.layers.GlobalMaxPooling1D()(convolved)

    compressed = tf.keras.layers.Dense(50, activation="tanh")(processed)
    compressed = tf.keras.layers.Dropout(0.3)(compressed)
    model = tf.keras.models.Model(inputs=input_seq, outputs=compressed)

    return model

def main_model(inputs_lens, fsizes=(8, 16, 24)):
    # 21 points * 3 coordinates + 1 timestamp column = 64
    width = Misc.COUNT_OF_POINTS * 3 + 1

    inputs = []
    for i in range(Misc.COUNT_OF_EXERCISES):
        inputs.append(tf.keras.layers.Input(shape=(inputs_lens[i], width)))

    base_nets = []
    for i in range(Misc.COUNT_OF_EXERCISES):
        # TODO down-sampling?
        base_nets.append(get_base_model(inputs_lens[i], fsizes[0], width))

    embeddings = []
    for i in range(Misc.COUNT_OF_EXERCISES):
        embeddings.append(base_nets[i](inputs[i]))

    merged = tf.keras.layers.Concatenate()(embeddings)
    out = tf.keras.layers.Dense(1, activation='sigmoid')(merged)
    model = tf.keras.models.Model(inputs=inputs, outputs=out)

    return model

And later I use it as follows.

        n_outputs = 1
        n_epochs = 10
        batch_size = 1

        inputs_lens = []
        for i in range(Misc.COUNT_OF_EXERCISES):
            inputs_lens.append(patients.get_max_row_count())  # TODO

        net = main_model(inputs_lens)
        net.compile(optimizer='rmsprop', loss='mse', metrics=['accuracy'])

        generator = Generator(patients)

        net.fit(
            generator,
            epochs=n_epochs,
            steps_per_epoch=len(generator),
            verbose=2)

The problem: As far as I know, I need to fit the net with tuples (x, y), where y is a batch of results (an array of evaluation numbers) and x is a batch of inputs (an array of shape (batch_size, exercise_count, timesteps, values)). I also prepared a generator that provides the batches:

print(generator.getitem(0)[0].shape)  # (32, 9, 678, 64) -> (batch_size, exercises, steps, values)
print(generator.getitem(0)[1].shape)  # (32,) -> (batch_size,)
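For comparison with these shapes: main_model is built with nine separate Input layers, each of shape (steps, values), while the generator yields a single 4D array. A minimal sketch of one possible way to line the two up, assuming the batch is a NumPy array, is to slice along the exercise axis and hand the model a list of nine 3D arrays. The helper name split_per_exercise is hypothetical, and the dummy zero arrays only reproduce the reported shapes.

```python
import numpy as np

def split_per_exercise(x):
    """Turn a (batch, exercises, steps, values) array into a list with one
    (batch, steps, values) array per exercise."""
    return [x[:, i] for i in range(x.shape[1])]

# Dummy batch with the shapes reported by the generator.
x = np.zeros((32, 9, 678, 64), dtype=np.float32)
y = np.zeros((32,), dtype=np.float32)

x_list = split_per_exercise(x)
print(len(x_list), x_list[0].shape)  # 9 (32, 678, 64)
```

Under this assumption the generator would return (x_list, y) instead of (x, y), so that each of the nine inputs receives the 3-dimensional batch its Conv1D branch was built for.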

However, the net seems to expect only 3 dimensions. As I try to run it, the following error occurs:

ValueError: Input 0 of layer conv1d is incompatible with the layer: expected ndim=3, found ndim=4. Full shape received: [None, None, None, None]

along with warnings:

WARNING:tensorflow:Model was constructed with shape (None, 678, 64) for input Tensor("input_1:0", shape=(None, 678, 64), dtype=float32), but it was called on an input with incompatible shape (None, None, None, None).
WARNING:tensorflow:Model was constructed with shape (None, 678, 64) for input Tensor("input_10:0", shape=(None, 678, 64), dtype=float32), but it was called on an input with incompatible shape (None, None, None, None).

Net summary:

print(net.summary())

Model: "model_9"
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to                     
==================================================================================================
input_1 (InputLayer)            [(None, 678, 64)]    0                                            
__________________________________________________________________________________________________
input_2 (InputLayer)            [(None, 678, 64)]    0                                            
__________________________________________________________________________________________________
input_3 (InputLayer)            [(None, 678, 64)]    0                                            
__________________________________________________________________________________________________
input_4 (InputLayer)            [(None, 678, 64)]    0                                            
__________________________________________________________________________________________________
input_5 (InputLayer)            [(None, 678, 64)]    0                                            
__________________________________________________________________________________________________
input_6 (InputLayer)            [(None, 678, 64)]    0                                            
__________________________________________________________________________________________________
input_7 (InputLayer)            [(None, 678, 64)]    0                                            
__________________________________________________________________________________________________
input_8 (InputLayer)            [(None, 678, 64)]    0                                            
__________________________________________________________________________________________________
input_9 (InputLayer)            [(None, 678, 64)]    0                                            
__________________________________________________________________________________________________
model (Model)                   (None, 50)           5680        input_1[0][0]                    
__________________________________________________________________________________________________
model_1 (Model)                 (None, 50)           5680        input_2[0][0]                    
__________________________________________________________________________________________________
model_2 (Model)                 (None, 50)           5680        input_3[0][0]                    
__________________________________________________________________________________________________
model_3 (Model)                 (None, 50)           5680        input_4[0][0]                    
__________________________________________________________________________________________________
model_4 (Model)                 (None, 50)           5680        input_5[0][0]                    
__________________________________________________________________________________________________
model_5 (Model)                 (None, 50)           5680        input_6[0][0]                    
__________________________________________________________________________________________________
model_6 (Model)                 (None, 50)           5680        input_7[0][0]                    
__________________________________________________________________________________________________
model_7 (Model)                 (None, 50)           5680        input_8[0][0]                    
__________________________________________________________________________________________________
model_8 (Model)                 (None, 50)           5680        input_9[0][0]                    
__________________________________________________________________________________________________
concatenate (Concatenate)       (None, 450)          0           model[1][0]                      
                                                                 model_1[1][0]                    
                                                                 model_2[1][0]                    
                                                                 model_3[1][0]                    
                                                                 model_4[1][0]                    
                                                                 model_5[1][0]                    
                                                                 model_6[1][0]                    
                                                                 model_7[1][0]                    
                                                                 model_8[1][0]                    
__________________________________________________________________________________________________
dense_9 (Dense)                 (None, 1)            451         concatenate[0][0]                
==================================================================================================
Total params: 51,571
Trainable params: 51,571
Non-trainable params: 0
__________________________________________________________________________________________________

Any help is appreciated.

DerekG · Accepted Answer

I'm more familiar with the PyTorch framework for neural networks, but I think the basic layer logic is defined the same way. I believe the issue you are running into is your use of Conv1D. The conv layers are not very intuitive, but it breaks down like this:

Conv1D expects 3-dimensional input. In PyTorch's layout this is [batch_size, n_feat_vecs, feat_vec_idx]: a 1D kernel is convolved along 1D feature vectors, and you have n_feat_vecs of these for each item in your batch. Keras defaults to channels-last, so there the same input is laid out as [batch_size, steps, channels].
Conv2D expects 4-dimensional input. In PyTorch's layout this is [batch_size, n_feat_maps, feat_map_row, feat_map_column]: a 2D kernel is convolved along 2D feature maps, and you have n_feat_maps of these for each item in your batch. In Keras the default layout is [batch_size, rows, cols, channels].
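To illustrate what a 1D convolution does with a 3-dimensional input, here is a naive NumPy sketch using the Keras default channels-last layout (batch, steps, channels). It is not how Keras implements Conv1D; the function name conv1d_valid is hypothetical, there is no bias or activation, and it uses "valid" padding for brevity, whereas the question's layer uses padding="same".

```python
import numpy as np

def conv1d_valid(x, kernel):
    """Naive 1D convolution (cross-correlation, as in deep-learning libraries).

    x:      (batch, steps, channels)
    kernel: (kernel_size, channels, filters)
    returns (batch, steps - kernel_size + 1, filters)
    """
    batch, steps, channels = x.shape
    k, k_channels, filters = kernel.shape
    assert channels == k_channels, "kernel and input channel counts differ"
    out = np.empty((batch, steps - k + 1, filters), dtype=x.dtype)
    for t in range(steps - k + 1):
        window = x[:, t:t + k, :]  # (batch, k, channels)
        # Sum over the time window and over all channels for each filter.
        out[:, t, :] = np.tensordot(window, kernel, axes=([1, 2], [0, 1]))
    return out

# One exercise per batch item: 678 time steps, 64 columns treated as channels.
x = np.ones((2, 678, 64), dtype=np.float32)
kernel = np.ones((8, 64, 10), dtype=np.float32)  # kernel size 8, 10 filters
out = conv1d_valid(x, kernel)
print(out.shape)  # (2, 671, 10)
```

The kernel slides along the steps axis only and mixes all 64 columns at each position, which is why the layer needs exactly one time axis and one channel axis per batch item.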

Now in your case, your input is currently 4-dimensional, so it is incompatible with Conv1D. You have two options for dealing with this:

Use Conv2D - this will not require any input reformatting, but you will have to use a reshape operation after the conv layers to format the result into a form that dense layers can accept. A 2D convolution convolves a kernel across both the multiple values of an individual timestep and the timesteps themselves. If you have reason to believe that the temporal relationship between timesteps contains useful information that a kernel convolution can exploit, this is the way to go.
Use Conv1D - alternatively, you might decide that you don't want to convolve a 2D kernel across timesteps and values simultaneously. In this case, you need to flatten your 4D input tensor into 3 dimensions, probably by stacking all of the values for all timesteps along a single dimension. The trade-off is that the model can no longer learn a kernel that captures temporal correlation between timesteps. The flattening operation should be something along the lines of inputs.reshape(batch_size, n_exercises, -1).
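As a rough illustration of the flattening in the second option, the following NumPy sketch applies that reshape to a dummy array with the question's exercise, step, and value counts. The batch size of 4 and the arange contents are assumptions chosen only to keep the example small and to make the index mapping checkable.

```python
import numpy as np

batch_size, n_exercises, steps, values = 4, 9, 678, 64
inputs = np.arange(batch_size * n_exercises * steps * values, dtype=np.float32)
inputs = inputs.reshape(batch_size, n_exercises, steps, values)

# Merge the steps and values axes into one long axis per exercise.
flat = inputs.reshape(batch_size, n_exercises, -1)
print(flat.shape)  # (4, 9, 43392)

# Element (b, e, t, v) of the 4D array lands at (b, e, t * values + v).
b, e, t, v = 1, 2, 100, 7
same = flat[b, e, t * values + v] == inputs[b, e, t, v]
print(same)  # True
```

After this reshape each exercise is a single vector of 678 * 64 = 43392 numbers, so time steps and values are no longer separate axes.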

You can decide which of these will be more useful and likely more informative for your particular task. Hope this helps!
