vincent

Reputation: 1678

Segment, crop (bounding boxes) and label characters with OpenCV

I have a set of images, each showing a sequence of characters. I'm wondering whether OpenCV or another technique can segment and crop each character from the image. For instance:

I have as input:

[image: a sequence of digits]

I want to get:

[image: cropped character] is 5

[image: cropped character] is 0

[image: cropped character] is 4

[image: cropped character] is 1

[image: cropped character] is 9

[image: cropped character] is 2

Upvotes: 0

Views: 989

Answers (2)

Tides

Reputation: 121

If you want to segment the numbers, I would first try morphological opening to fill the holes in your digits (opening because your characters are black on a white background; it would be closing if it were the opposite). Then I would project the pixels vertically, that is, sum the ink in each column, and analyze the resulting profile. The valley points of this profile give you the vertical boundaries between characters. You can do the same horizontally to get the top and bottom limits of each character. This approach only works if the text is horizontal.
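The projection idea can be sketched roughly as follows. This is only a minimal sketch with NumPy, not a tested pipeline: it assumes the image is already binarized with ink as nonzero pixels, and it simplifies "valley points" to columns with no ink at all, so touching characters would need a smarter valley search. The function names `split_columns` and `crop_characters` are made up for illustration.

```python
import numpy as np

def split_columns(binary):
    """Return (start, end) column ranges that contain ink; end is exclusive.

    Assumes `binary` is a 2-D array where nonzero pixels are ink.
    """
    profile = (binary > 0).sum(axis=0)  # vertical projection: ink pixels per column
    has_ink = profile > 0
    spans, start = [], None
    for x, ink in enumerate(has_ink):
        if ink and start is None:
            start = x                   # entering a character
        elif not ink and start is not None:
            spans.append((start, x))    # leaving a character at an empty column
            start = None
    if start is not None:
        spans.append((start, len(has_ink)))
    return spans

def crop_characters(binary):
    """Crop each column span, then trim empty rows with a horizontal projection."""
    crops = []
    for x0, x1 in split_columns(binary):
        strip = binary[:, x0:x1]
        rows = np.flatnonzero((strip > 0).sum(axis=1))
        crops.append(strip[rows[0]:rows[-1] + 1])
    return crops

# Synthetic stand-in for a binarized image with three separated blobs.
demo = np.zeros((10, 20), dtype=np.uint8)
demo[2:8, 1:5] = 255
demo[3:9, 8:11] = 255
demo[1:6, 14:19] = 255
print(split_columns(demo))
print([c.shape for c in crop_characters(demo)])
```

On a real image you would binarize (and invert, since your ink is dark) before calling these, and the morphological step would come before the projection.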

Then you could use a standard OCR library or go for deep learning. Since these digits appear to come from the MNIST dataset, you will find plenty of examples of digit recognition with deep learning or other techniques on this dataset:

http://yann.lecun.com/exdb/mnist/

Upvotes: 1

Soltius

Reputation: 2263

You have two problems to solve to get from your input to your output:

The first is separating your characters. If your images always look like this, with the digits neatly separated, then you should have no problem at all separating them using findContours or connectedComponents, maybe along with a bounding box function like minAreaRect.

The second problem is, once you have separated your digits, telling which digit each image represents. This problem has a name: OCR.
If you have a lot of images, it is also possible to train a classification algorithm, as your tagging of this question suggests. The "hot topic" right now is deep learning with neural networks, but for simple applications, regular machine learning classification with hand-designed features might do the trick.
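To give an idea of what "hand-designed features" could look like, here is a toy sketch using only NumPy. The feature choice (ink density on a 4x4 grid, sometimes called zoning) and the nearest-neighbour rule are just one possible assumption for illustration, and the names `zoning_features` and `nearest_label` are hypothetical. It assumes each cropped glyph is at least 4x4 pixels with ink as nonzero pixels.

```python
import numpy as np

def zoning_features(glyph, grid=4):
    """Mean ink density in each cell of a grid x grid partition of the glyph."""
    g = (np.asarray(glyph) > 0).astype(float)
    rows = np.array_split(np.arange(g.shape[0]), grid)
    cols = np.array_split(np.arange(g.shape[1]), grid)
    return np.array([g[np.ix_(r, c)].mean() for r in rows for c in cols])

def nearest_label(glyph, train_glyphs, train_labels):
    """Label of the training glyph whose features are closest (Euclidean)."""
    feats = np.stack([zoning_features(t) for t in train_glyphs])
    dists = np.linalg.norm(feats - zoning_features(glyph), axis=1)
    return train_labels[int(np.argmin(dists))]

# Two toy "training" glyphs: a vertical bar and a ring.
bar = np.zeros((8, 8))
bar[:, 3:5] = 1
ring = np.ones((8, 8))
ring[2:6, 2:6] = 0

# A slightly noisy bar should still be matched to the bar.
noisy = bar.copy()
noisy[0, 0] = 1
print(nearest_label(noisy, [bar, ring], ["1", "0"]))
```

Because the features have a fixed length regardless of glyph size, crops of different sizes can be compared directly. With real data you would use many labelled examples per digit and probably a proper classifier on top of the features.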

Upvotes: 2
