NOOR E HIRA ISLAM

Reputation: 11

CTC implementation in Keras error

I am working on image OCR with my own dataset. I have 1000 images of variable length, and I want to feed them to the network as patches of size 46x1. I have generated the patches, and because my labels are Urdu text, I have encoded them as UTF-8. I want to use CTC in the output layer and tried to follow the image_ocr example on GitHub, but my CTC implementation raises the following error:

'numpy.ndarray' object has no attribute 'get_shape'

Could anyone point out my mistake and suggest a solution?

My code is:

X_train, X_test, Y_train, Y_test = train_test_split(imageList, labelList, test_size=0.3)
X_train_patches = np.array([image.extract_patches_2d(X_train[i], (46, 1)) for i in range(700)]).reshape(700, 1, 1)  # (samples, timesteps, dimensions)
X_test_patches = np.array([image.extract_patches_2d(X_test[i], (46, 1)) for i in range(300)]).reshape(300, 1, 1)


Y_train = np.array([i.encode("utf-8") for i in str(Y_train)])
Label_length = 1
input_length = 1


# Loss function
def ctc_lambda_func(args):
    y_pred, labels, input_length, label_length = args
    # the 2 is critical here since the first couple outputs of the RNN
    # tend to be garbage:
    y_pred = y_pred[:, 2:, :]
    return K.ctc_batch_cost(labels, y_pred, input_length, label_length)

# Building the model

model = Sequential()
model.add(LSTM(20, input_shape=(None, X_train_patches.shape[2]), return_sequences=True))
model.add(Activation('relu'))
model.add(TimeDistributed(Dense(12)))
model.add(Activation('tanh'))
model.add(LSTM(60, return_sequences=True))
model.add(Activation('relu'))
model.add(TimeDistributed(Dense(40)))
model.add(Activation('tanh'))
model.add(LSTM(100, return_sequences=True))
model.add(Activation('relu'))
loss_out = Lambda(ctc_lambda_func, name='ctc')([X_train_patches, Y_train, input_length, Label_length])

Upvotes: 1

Views: 1507

Answers (1)

nemo

Reputation: 57659

Keras currently models CTC by implementing the loss function as a layer, which you have already done (loss_out). Your problem is that the inputs you pass to that layer are NumPy arrays, not Theano/TensorFlow tensors.

One way to fix that is to model these values as inputs to your model. This is exactly what the example you copied the code from does:

labels = Input(name='the_labels', shape=[img_gen.absolute_max_string_len], dtype='float32')
input_length = Input(name='input_length', shape=[1], dtype='int64')
label_length = Input(name='label_length', shape=[1], dtype='int64')
# Keras doesn't currently support loss funcs with extra parameters
# so CTC loss is implemented in a lambda layer
loss_out = Lambda(ctc_lambda_func, output_shape=(1,), name='ctc')([y_pred, labels, input_length, label_length])

To make this work you need to drop the Sequential model and use the functional API, exactly as the example you followed does.
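
Once the labels and lengths are model inputs, each of them needs one row per training sample. Here is a minimal sketch of how those rows could be prepared. It assumes the CTC labels should be integer character indices (not UTF-8 bytes) padded to a fixed length. The helper names build_vocab, encode_labels and make_input_length, and the pad value of -1, are hypothetical choices for illustration and not part of Keras.

```python
def build_vocab(texts):
    """Map every distinct character in the label strings to an integer index."""
    chars = sorted({ch for text in texts for ch in text})
    return {ch: idx for idx, ch in enumerate(chars)}


def encode_labels(texts, vocab, max_len, pad_value=-1):
    """Return (labels, label_length) as nested lists, one row per sample.

    labels has max_len entries per row, matching an Input of shape [max_len].
    label_length has one entry per row, matching an Input of shape [1].
    pad_value is an arbitrary filler (assumption); the real length of each
    label is carried by label_length.
    """
    labels, label_length = [], []
    for text in texts:
        if len(text) > max_len:
            raise ValueError("label longer than max_len: %r" % text)
        indices = [vocab[ch] for ch in text]
        labels.append(indices + [pad_value] * (max_len - len(indices)))
        label_length.append([len(indices)])
    return labels, label_length


def make_input_length(num_samples, time_steps, skipped_steps=2):
    """Return one [length] row per sample for the input_length input.

    skipped_steps mirrors the y_pred[:, 2:, :] slice in ctc_lambda_func,
    which drops the first two time steps before the loss is computed.
    """
    if time_steps <= skipped_steps:
        raise ValueError("time_steps must be larger than skipped_steps")
    return [[time_steps - skipped_steps] for _ in range(num_samples)]
```

For example, encode_labels(["\u0627\u0628"], build_vocab(["\u0627\u0628"]), max_len=4) gives the label row [0, 1, -1, -1] with a label_length row of [2]. The nested lists can be wrapped with np.asarray before being passed to the model. Keep in mind that CTC needs every input_length to be at least as large as the matching label_length.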

Upvotes: 2
