alohapinilla

Reputation: 55

Keras non-Sequential model, trouble with dimensions and reshape

I'm trying to build an architecture similar to the one in this example:
https://github.com/fchollet/keras/blob/master/examples/image_ocr.py#L480
however, with my data I keep running into dimension problems, and I haven't found a good resource that explains how to manage dimensions with your own data rather than MNIST or other built-in datasets.

Context: I'm trying the architecture mentioned above on images of text; to start, say I use 2000 of them. For the labels I decided on one-hot encoding. These are the data characteristics:
images, fixed shape: (2000, 208, 352, 1) # B&W
one-hot labels, shape: (2000, 346, 1) # 2000 samples and 346 classes; the last axis is there to make the array 3-dimensional, since that is apparently needed for the softmax
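
To make the shapes easy to reproduce, stand-in arrays can be built with numpy (random placeholder values, not my real data):

import numpy as np
train_data = np.random.rand(2000, 208, 352, 1)  # stand-in B&W images
train_label = np.zeros((2000, 346, 1))          # stand-in one-hot labels
train_label[np.arange(2000), np.random.randint(346, size=2000), 0] = 1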

and now the code:

from keras.models import Model
from keras.layers import Input, Dense, Activation, Reshape
from keras.layers import Conv2D, MaxPooling2D, GRU
from keras.layers import add, concatenate
from keras.optimizers import SGD

nb_classes = 346
max_lin, max_col = (208, 352)
input_shape = (max_lin, max_col, 1)
conv_filters = 16
kernel_size = (3, 3)
pool_size = 2
time_dense_size = 32
rnn_size = 512
act = 'relu'

input_data = Input(name='the_input', shape=input_shape)
inner = Conv2D(conv_filters, kernel_size, padding='same',
            activation=act, name='CONV2D_1')(input_data)
inner = MaxPooling2D(pool_size=(pool_size, pool_size),
            name='MXPOOL2D_1')(inner)
inner = Conv2D(conv_filters, kernel_size, padding='same',
            activation=act, name='CONV2D_2')(input_data)
inner = MaxPooling2D(pool_size=(pool_size, pool_size),
            name='MXPOOL2D_2')(inner)
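# note: this second Conv2D is applied to input_data again (not to inner),
# so only this branch feeds forward; its output here is (None, 104, 176, 16)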

#This is my problem: I don't really know how to reshape it with my data.
#I chose (104, 2816) because other values didn't work, and I found that
#the layer before is (104, 176, 16) = (104, 176*16) = (104, 2816); other values
#give me "ValueError: total size of new array must be unchanged"

conv_to_rnn_dims = (104,2816)
inner = Reshape(target_shape=conv_to_rnn_dims, name='reshape')(inner)

inner = Dense(time_dense_size, activation=act, name='dense1')(inner)
gru_1 = GRU(rnn_size, return_sequences=True, kernel_initializer='he_normal', name='gru1')(inner)
gru_1b = GRU(rnn_size, return_sequences=True, go_backwards=True, kernel_initializer='he_normal', name='gru1_b')(inner)
gru1_merged = add([gru_1, gru_1b])
gru_2 = GRU(rnn_size, return_sequences=True, kernel_initializer='he_normal', name='gru2')(gru1_merged)
gru_2b = GRU(rnn_size, return_sequences=True, go_backwards=True, kernel_initializer='he_normal', name='gru2_b')(gru1_merged)
gru_conc = concatenate([gru_2, gru_2b])
print("GruCOnc: ",gru_conc.shape)
inner = Dense(nb_classes, kernel_initializer='he_normal',
            name='DENSE_2')(gru_conc)
print("2ndDense: ",inner.shape)
y_pred = Activation('softmax',name='softmax')(inner)
print(y_pred.shape)
model = Model(inputs=input_data, outputs=y_pred)
print(model.summary())

sgd = SGD(lr=0.02, decay=1e-6, momentum=0.9, nesterov=True, clipnorm=5)
model.compile(loss='categorical_crossentropy',optimizer=sgd)
model.fit(train_data, train_label, batch_size=10, epochs=2, verbose=1)
score = model.evaluate(x_test, y_test, verbose=1)

print(score)

And after running the code I get:

ValueError: Error when checking target: expected softmax to have shape (None, 104, 346) but got array with shape (2000, 346, 1)

So the big question here is: what is that 104? The 346 is clearly the number of classes, but the other value leaves me completely lost.

Thanks everyone for reading my question.

Upvotes: 1

Views: 1384

Answers (1)

  1. conv_to_rnn_dims = (104, 2816) is fictitious. As far as I can tell, you are trying to feed your CNN output into a Dense layer, but the last CNN layer is a MaxPooling, which produces a multi-dimensional feature map rather than the flat vector a Dense layer expects. You should use Flatten to make this connection. Let's check an example:

    from keras.models import Sequential
    from keras.layers import Conv2D, Flatten, Dense

    model = Sequential()
    model.add(Conv2D(16, (3, 3), padding="same", input_shape=(208, 352, 1)))
    # Produces 2000 x 208 x 352 x 16
    model.add(Conv2D(32, (3, 3), activation="tanh", padding="same"))
    # Produces 2000 x 208 x 352 x 32
    model.add(Flatten())
    # Produces 2000 x 2342912
    model.add(Dense(100, activation="sigmoid"))
    # Produces 2000 x 100
    

This means you do not need a Reshape layer here.

  2. After this Dense layer you should use a Reshape to get the output ready for the GRU. You now have 100 timesteps to read, so reshape with model.add(Reshape((100, 1))). The output of the network is then 2000 x 100 x 1, which you can feed safely into your GRU layer.
  3. Finally, for a classification problem with one-hot vectors and a Dense layer at the output, your target shape should be 2000 x 346, so the final Dense layer should have 346 nodes (see the sketch below).
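
Putting the three points together, a minimal sketch of the suggested pipeline could look like this (the relu/sigmoid activations and the GRU width of 64 are illustrative placeholders, not tuned values):

    from keras.models import Sequential
    from keras.layers import Conv2D, Flatten, Dense, Reshape, GRU

    model = Sequential()
    model.add(Conv2D(16, (3, 3), padding="same", activation="relu",
                     input_shape=(208, 352, 1)))  # 2000 x 208 x 352 x 16
    model.add(Flatten())                          # 2000 x 1171456
    model.add(Dense(100, activation="sigmoid"))   # 2000 x 100
    model.add(Reshape((100, 1)))                  # 2000 x 100 x 1: 100 timesteps
    model.add(GRU(64))                            # 2000 x 64 (last timestep only)
    model.add(Dense(346, activation="softmax"))   # 2000 x 346, matches the target
    model.compile(loss="categorical_crossentropy", optimizer="sgd")

The one-hot labels then need to be squeezed from (2000, 346, 1) to (2000, 346), e.g. with train_label.reshape(2000, 346), so they match the final 2000 x 346 output.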

Upvotes: 1
