Reputation: 13
I am trying to read handwritten or typed text from a form that has comb fields, as shown in the following image.
I tried using the Cloud Vision API for PDF and handwriting OCR (with the DOCUMENT_TEXT_DETECTION and TEXT_DETECTION feature types), but it does not return correct data: the field separator (|) is read as the letter I. Does the Google Cloud Vision API support reading handwritten or typed text from a PDF or image that has comb fields? Or is there an option to blur or remove the pipes between the letters before reading the text?
Upvotes: 0
Views: 697
Reputation: 479
There is a variation of OCR called Intelligent Character Recognition (ICR) that is designed for exactly this kind of input. The boxes actually make recognition easier.
Upvotes: 0
Reputation: 997
There is no option or parameter to specify comb fields in a Vision API request. To improve the handwriting recognition results, I would advise pre-processing the image to remove the comb field lines. Since the Vision API is not suited to pre-processing images, you will have to do it yourself, which requires additional coding. One thing you could try is thresholding, which works if the text and the comb field lines are different shades of black or different colours altogether. Another option is to take an identical image of the form with the comb fields but no handwriting and subtract it from the filled-in image, which leaves an image containing only the handwritten text.
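To make the two ideas concrete, here is a minimal sketch in plain Python. It assumes a grayscale image stored as a list of rows (0 = black, 255 = white), comb lines that are lighter than the ink, and a blank template that is already aligned pixel-for-pixel with the filled form. The function names, the cutoff and the tolerance are all made up for illustration; in practice you would probably do the same thing with an image library.

```python
def threshold(image, cutoff):
    """Keep pixels darker than `cutoff` (the ink); turn everything else white."""
    return [[px if px < cutoff else 255 for px in row] for row in image]


def subtract_template(filled, blank, tolerance=30):
    """Whiten every pixel where the filled form matches the blank template.

    Pixels that differ by more than `tolerance` are assumed to be
    handwriting and are kept unchanged.
    """
    return [
        [255 if abs(f - b) <= tolerance else f for f, b in zip(f_row, b_row)]
        for f_row, b_row in zip(filled, blank)
    ]


# Toy 3x5 example: comb lines are mid-grey (120), ink is near-black (20).
blank = [
    [255, 120, 255, 120, 255],
    [255, 120, 255, 120, 255],
    [120, 120, 120, 120, 120],
]
filled = [
    [20, 120, 255, 120, 20],
    [20, 120, 20, 120, 255],
    [120, 120, 120, 120, 120],
]

by_threshold = threshold(filled, cutoff=80)
by_subtraction = subtract_template(filled, blank)
```

In this toy case both approaches leave only the ink pixels. Thresholding only works when the comb lines are clearly lighter than the ink, and subtraction only works when the scan and the blank template line up, so a real scan would likely need to be aligned to the template first. The cleaned image is what you would then send to the Vision API.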
Upvotes: 1