Reputation: 917

Segmenting text from images

I want to extract a certain type of text from images of ID cards:

As you can see, the lighting and sharpness vary from image to image. The ultimate goal is to recognize the black text. When it is well separated, I've managed to do this well with Tesseract OCR (the language is Vietnamese, VIE, in case you'd like to try it yourself with Tesseract). However, in the examples above, the black text overlaps the blue text, which confuses Tesseract. So my current goal is to remove the blue text cleanly without heavily distorting the blurry black pixels, so that Tesseract still works.

What are the most robust ways to do this? (Code examples in Python would be appreciated if possible.)

Upvotes: 0

Answers (1)

Cerovec

Reputation: 1313

You can try image segmentation by color. If a pixel's RGB value is close to (0, 0, 0), that pixel is a likely candidate for part of the relevant black text.
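As a rough sketch of that idea, the snippet below marks pixels whose Euclidean distance from (0, 0, 0) in RGB space is below a threshold. The function name near_black_mask and the threshold of 120 are assumptions made up for illustration, not tuned values; the image is assumed to be an H x W x 3 uint8 array in RGB order.

```python
import numpy as np


def near_black_mask(rgb, max_distance=120.0):
    """Return a boolean mask of pixels whose RGB value is close to (0, 0, 0).

    rgb is assumed to be an (H, W, 3) uint8 array in RGB order.
    max_distance is a hypothetical threshold that needs tuning per dataset.
    """
    pixels = rgb.astype(np.float32)
    # Euclidean distance of every pixel from pure black.
    distance = np.sqrt((pixels ** 2).sum(axis=-1))
    return distance <= max_distance


# Tiny 1 x 3 test image: dark text pixel, blue pixel, light background pixel.
sample = np.array([[[20, 20, 25], [30, 60, 200], [240, 240, 235]]], dtype=np.uint8)
mask = near_black_mask(sample)
print(mask)
```

Note that a dark blue pixel can still fall inside the threshold, so a plain distance test may keep some of the overlapping blue strokes.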

Another approach would be to check the chrominance component of each pixel. The assumption is that black text has lower chrominance than the colored text, so the low-chrominance pixels are the relevant part of the picture.
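A minimal sketch of the chrominance check, assuming the standard full-range BT.601 (JPEG-style) RGB to YCbCr conversion. Low chrominance alone would also match the light card background, so this sketch additionally requires low luma; that extra condition, the function name low_chroma_dark_mask, and both thresholds are my assumptions and would need tuning.

```python
import numpy as np


def low_chroma_dark_mask(rgb, max_chroma=20.0, max_luma=140.0):
    """Return a boolean mask of pixels that are both low in chrominance and dark.

    rgb is assumed to be an (H, W, 3) uint8 array in RGB order.
    max_chroma and max_luma are hypothetical thresholds, not tuned values.
    """
    pixels = rgb.astype(np.float32)
    r, g, b = pixels[..., 0], pixels[..., 1], pixels[..., 2]

    # Full-range BT.601 conversion, with Cb and Cr centered on zero.
    luma = 0.299 * r + 0.587 * g + 0.114 * b
    cb = -0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 0.5 * r - 0.418688 * g - 0.081312 * b

    # Chrominance magnitude: how far the pixel is from neutral gray.
    chroma = np.sqrt(cb ** 2 + cr ** 2)
    return (chroma <= max_chroma) & (luma <= max_luma)


# Tiny 1 x 3 test image: dark text pixel, dark blue pixel, light background pixel.
sample = np.array([[[20, 20, 25], [20, 30, 90], [240, 240, 235]]], dtype=np.uint8)
mask = low_chroma_dark_mask(sample)
print(mask)
```

The dark blue pixel here is rejected because of its chrominance even though it is fairly close to black in RGB terms, which is the case that matters when blue strokes overlap the black text.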

The idea is to figure out parts of the image where likely candidates for relevant text are present, and then just white out whatever's not relevant.
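The whitening step could look roughly like this, assuming some segmentation step has already produced a boolean mask of candidate text pixels. The function name whiten_outside is hypothetical, and the mask below is written by hand purely for illustration.

```python
import numpy as np


def whiten_outside(rgb, keep_mask):
    """Return a copy of rgb where every pixel outside keep_mask is set to white.

    rgb is assumed to be an (H, W, 3) uint8 array; keep_mask is an (H, W)
    boolean array marking the pixels that are candidates for relevant text.
    """
    # Broadcast the 2-D mask over the three color channels.
    return np.where(keep_mask[..., None], rgb, np.uint8(255)).astype(np.uint8)


# Tiny 1 x 3 test image and a hand-written mask that keeps only the first pixel.
sample = np.array([[[20, 20, 25], [30, 60, 200], [240, 240, 235]]], dtype=np.uint8)
keep = np.array([[True, False, False]])
cleaned = whiten_outside(sample, keep)
print(cleaned)
```

The cleaned array can then be passed to the OCR step. Because the kept pixels are copied unchanged, the blurry black strokes are not distorted, although a hard mask may clip their soft edges.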

However, these are quick and dirty solutions, and they struggle when ID cards are photographed in different lighting conditions, when the cards are damaged, when the photos come from devices with a wide range of cameras, or when there are slight variations between types of ID cards. We've worked on this problem quite a lot, specifically on ID documents. Eventually, our solution was to use machine learning on a large number of generated images and train the ML models to return just the relevant text from ID cards. It required a huge amount of work, but it paid off, as we now have very reliable data extraction, and that includes IDs from Vietnam.

Disclaimer: I'm working at Microblink, where we develop commercial OCR products, one of them being for ID scanning.

Upvotes: 1
