How to give own dataset to keras image_ocr

Question

I am aware of the Keras image_ocr example. It uses an image generator to synthesize its training images; however, I am having difficulty feeding my own dataset to the model for training.

The repo link is: https://github.com/fchollet/keras/blob/master/examples/image_ocr.py

I have created two arrays, x and y. My image paths and their corresponding ground-truth (gt) labels are stored in a CSV file.

x holds the images, with shape [nb_samples, w, h, c].
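
To make that layout concrete, here is a minimal sketch of how such an array could be built. It assumes the channels_last ordering, in which the image_ocr example declares its input as (img_w, img_h, 1), while cv2.imread(path, 0) returns arrays of shape (h, w). The helper name to_model_input is hypothetical, the images are assumed to be already resized to one common size, and the scaling to [0, 1] is an assumption as well.

```python
import numpy as np


def to_model_input(images):
    """Stack equally sized grayscale images of shape (h, w)
    into one array of shape (nb_samples, w, h, 1)."""
    batch = np.stack(images).astype(np.float32) / 255.0  # (n, h, w), scaled to [0, 1]
    batch = batch.transpose(0, 2, 1)                      # swap axes -> (n, w, h)
    return batch[..., np.newaxis]                         # add channel axis -> (n, w, h, 1)


# Stand-ins for images read with cv2.imread(path, 0) and resized
# to width 128, height 64 (hypothetical sizes taken from the comments
# in the pre-processing code).
fake_images = [np.zeros((64, 128), dtype=np.uint8) for _ in range(3)]
x = to_model_input(fake_images)
print(x.shape)  # (3, 128, 64, 1)
```

Without a common image size, np.array(x) cannot produce a regular 4-D array, so the resize step matters for this layout.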

y holds the labels, which are strings (the gt).
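
Since the labels are plain strings, a CTC-based model would need them as padded integer sequences plus their lengths. The following is only a sketch of one way to do that: the alphabet (lowercase letters plus space), the padding value of -1, and the helper name encode_labels are all assumptions, not something taken from my script.

```python
import numpy as np

# Assumed character set: lowercase letters plus space.
ALPHABET = u"abcdefghijklmnopqrstuvwxyz "


def encode_labels(texts, max_len, pad_value=-1):
    """Map each string to alphabet indices, padded to max_len.

    Returns (labels, lengths): labels has shape (n, max_len),
    lengths has shape (n, 1) and holds the unpadded length of each text.
    """
    labels = np.full((len(texts), max_len), pad_value, dtype=np.float32)
    lengths = np.zeros((len(texts), 1), dtype=np.int64)
    for i, text in enumerate(texts):
        if len(text) > max_len:
            raise ValueError("label longer than max_len: %r" % text)
        # str.index raises ValueError for characters outside ALPHABET.
        codes = [ALPHABET.index(ch) for ch in text]
        labels[i, :len(codes)] = codes
        lengths[i, 0] = len(codes)
    return labels, lengths


labels, lengths = encode_labels([u"cat", u"a b"], max_len=5)
print(labels)
print(lengths)
```

Characters outside the assumed alphabet raise an error here on purpose, so that unexpected symbols in the CSV are noticed early.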

Here is the code I am using to pre-process:

for i in range(0,len(read_file)):
    path = read_file['path'][i]
    label = read_file['gt'][i]
    path = path.strip('\n')
    img = cv2.imread(path,0)
    #Re-sizing the images
    #height = 64, width = 128
    #res_img = cv2.resize(img, (128,64))
    #cv2.imwrite(i,res_img)
    h,w =  img.shape
    x.append(img)
    y.append(label)
    size = img.size
    """
    print "Height: ", h #Height
    print "Width: ", w #Width
    print "Channel: ", c #Channel
    print "Size: ", size
    print "\n"
    """
print "H: ", h
print "W: ", w
print "S: ", size

x = np.array(x).astype(np.float32)
y = np.array(y)

x_train, x_test, y_train, y_test = train_test_split(x,y,test_size=0.3,random_state=42)

x_train = np.array(x_train).astype(np.float32)
y_train = np.array(y_train)
x_train = np.array(x_train)
x_test = np.array(x_test)
y_test = np.array(y_test)

print "Printing the shapes. \n"
print "X_train shape: ", x_train.shape
print "Y_train shape: ", y_train.shape
print "X_test shape: ", x_test.shape
print "Y_test shape: ", y_test.shape
print "\n"

This is followed by the Keras image_ocr code. The full code is here: https://gist.github.com/kjanjua26/b46388bbde9ded5cf1f077a9f0dedc4f

The error when I run this is:

`Traceback (most recent call last):
 File "preprocess.py", line 323, in <module>
 train(run_name, 0, 20, w)
 File "preprocess.py", line 314, in train
 model.fit(next_train(x_train), y_train, batch_size=7, epochs=20, verbose=1, validation_split=0.1, shuffle=True, initial_epoch=0)
 File "/home/kamranjanjua/anaconda2/lib/python2.7/site-packages/keras/engine/training.py", line 1358, in fit
batch_size=batch_size)
 File "/home/kamranjanjua/anaconda2/lib/python2.7/site-packages/keras/engine/training.py", line 1234, in _standardize_user_data
exception_prefix='input')
 File "/home/kamranjanjua/anaconda2/lib/python2.7/site-packages/keras/engine/training.py", line 100, in _standardize_input_data
'Found: ' + str(data)[:200] + '...')
 TypeError: Error when checking model input: data should be a Numpy array, or list/dict of Numpy arrays. Found: ...`
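
The message says that fit received something other than a NumPy array. As a small, Keras-free illustration of what that can look like: assuming next_train is a generator function (its definition is in the linked gist and not shown here), calling it returns a generator object rather than an array. The next_train below is a hypothetical stand-in, not the function from my script.

```python
import numpy as np


def next_train(x, batch_size=2):
    """Hypothetical stand-in: yields successive batches of x."""
    for start in range(0, len(x), batch_size):
        yield x[start:start + batch_size]


x_train = np.zeros((4, 128, 64, 1), dtype=np.float32)
batches = next_train(x_train)

# The call itself gives a generator object, which is not a NumPy array.
print(isinstance(batches, np.ndarray))   # False

# Only the items it yields are arrays.
first_batch = next(batches)
print(isinstance(first_batch, np.ndarray))  # True
print(first_batch.shape)                    # (2, 128, 64, 1)
```

If that assumption holds, the options would seem to be either passing the arrays themselves to fit, or using the generator-based training call that Keras 2 provides (fit_generator). Which of the two suits this script is not something the traceback alone settles.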

Any help would be appreciated.
