Reputation: 9532

Python retrieving the (x,y) coordinates of top left corner of black rectangle on JPEG image

BACKGROUND: I am trying to create software to mark answer papers automatically. The answer paper format is fixed, as shown below:

[Image: the answer paper]

PROBLEM: In order to detect which box the user crossed (A, B or C), I need to crop the image or apply a perspective transform based on the 4 black rectangles. How can I retrieve the coordinates of the four black rectangles in the above image, preferably with OpenCV?

ADDITIONAL INFO: Once I can crop out the portion containing the answer boxes, like below: [Image: cropped answer boxes]

Since I know the exact dimensions of each box, I can compare the black pixel count in each box (A, B, C) to check which one the user crossed (assuming the user does not cross more than one box).
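To make the pixel-count idea concrete, here is a minimal sketch in plain Python. It assumes the cropped strip is already available as a list of rows of 8-bit grey values, and that the box rectangles are known in advance. The names `dark_count`, `crossed_box`, the box coordinates and the threshold of 128 are all hypothetical, chosen only for illustration.

```python
def dark_count(gray, box, threshold=128):
    """Count pixels darker than `threshold` inside box = (x0, y0, x1, y1).

    x1 and y1 are exclusive. The threshold value is an assumption and
    would need tuning on real scans.
    """
    x0, y0, x1, y1 = box
    return sum(
        1
        for y in range(y0, y1)
        for x in range(x0, x1)
        if gray[y][x] < threshold
    )


def crossed_box(gray, boxes, threshold=128):
    """Return (label of the darkest box, dark pixel count per box)."""
    counts = {label: dark_count(gray, box, threshold) for label, box in boxes.items()}
    return max(counts, key=counts.get), counts


# Synthetic 4x12 white strip holding three 4x4 boxes side by side.
strip = [[255] * 12 for _ in range(4)]
# Draw a cross (two diagonals) inside the middle box.
for i in range(4):
    strip[i][4 + i] = 0
    strip[i][7 - i] = 0

# Hypothetical box layout; on a real sheet these come from the known format.
boxes = {"A": (0, 0, 4, 4), "B": (4, 0, 8, 4), "C": (8, 0, 12, 4)}
label, counts = crossed_box(strip, boxes)
```

On a real scan the printed box outlines also contribute dark pixels, so comparing the counts against each other (as above) is probably safer than comparing them against a fixed number.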

All constructive advice is welcome.

Upvotes: 2

Answers (1)

nico7et8

Reputation: 227

By the look of it, your black squares seem to be the only really black things on the page; everything else appears lighter. In addition, you roughly know where they are, or at least in which general areas: they have their own columns with nothing else above or below. So, assuming your image is 8-bit greyscale, here's how I would go about it without any shape recognition module:

First, filter the image for all pixels under a certain value (5? 10? 50?) and see if that's enough to filter out everything but the squares.
Then switch the mode of your image to black and white (no greys, just a 1-bit map) and invert it (black --> white, white --> black), so that the squares are the only non-zero pixels.
Then scan the columns of the image from left to right. For each column, sum all the pixels in that column. As long as the result is zero, you're not there yet. When the sum becomes non-zero, you've found a square. The row index of the non-zero value in that column corresponds to one of the square's corners (and you can even tell which square it is from whether it lies in the upper or lower part of the column). If you keep scanning, the sum will increase. By looking at the sum, you should be able to deduce the position of each square's corners, i.e. each change in the way the sum varies is a new corner. 1 big step: all squares perfectly aligned. 4 small steps: squares on a diagonal. Geometrically, unless the paper is torn or folded, the squares' left corners will show up either all at once or in order (top to bottom or bottom to top). Repeat from right to left.
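The three steps above can be sketched as follows. This is only an illustration in plain Python on a tiny synthetic page, assuming the image is a list of rows of 8-bit grey values. The function names and the default threshold of 50 are assumptions, not something tested on your scans.

```python
def dark_mask(gray, threshold=50):
    """Threshold and invert in one go: 1 where the pixel is darker than
    `threshold` (i.e. part of a square), 0 everywhere else."""
    return [[1 if px < threshold else 0 for px in row] for row in gray]


def column_sums(mask):
    """Sum of each column of the mask, from left to right."""
    return [sum(col) for col in zip(*mask)]


def first_dark_column(mask):
    """Return (x, rows) for the leftmost column containing a dark pixel,
    where rows lists the y indices of the dark pixels in that column.
    Returns None if the mask is empty."""
    for x, col in enumerate(zip(*mask)):
        if any(col):
            return x, [y for y, v in enumerate(col) if v]
    return None


# Synthetic 8x10 white page with two 2x2 black squares in the same columns.
page = [[255] * 10 for _ in range(8)]
for y in (1, 2, 5, 6):
    for x in (2, 3):
        page[y][x] = 0

mask = dark_mask(page)
sums = column_sums(mask)          # zero until the squares start
hit = first_dark_column(mask)     # leftmost column and its dark rows
```

Here `sums` jumps from 0 to 4 in one step, which is the "1 big step" case: both squares start in the same column. Scanning from right to left is the same loop over the reversed columns.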

The analysis of the sum is the trickiest part of course, but a quick look at 2 or 3 examples will give you a rough estimate that you can use for the others, e.g. a sum between 600 and 800 means only one square in this column, a sum between 1200 and 1600 means two squares, etc. Of course, there must be more practical solutions using pattern recognition, but that's cheating.
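As a rough sketch of that analysis: if the approximate side of a square in pixels is known, dividing the column sum by it gives an estimate of how many squares the column crosses. The helper `dark_runs` is a variant that is not part of the description above: it lists the vertical runs of dark pixels in one column, which gives the top and bottom row of each square directly. Both function names and the side length of 700 are hypothetical.

```python
def count_squares(col_sum, side):
    """Estimate how many squares a column crosses, given its sum of
    dark pixels and the approximate side of one square in pixels."""
    return round(col_sum / side)


def dark_runs(column):
    """Return (first_row, last_row) for each run of consecutive non-zero
    values in a single mask column."""
    runs = []
    start = None
    for y, v in enumerate(column):
        if v and start is None:
            start = y                      # a run begins
        elif not v and start is not None:
            runs.append((start, y - 1))    # the run ended on the previous row
            start = None
    if start is not None:                  # run reaches the bottom edge
        runs.append((start, len(column) - 1))
    return runs


# With squares about 700 pixels tall (an assumed value):
one = count_squares(650, 700)
two = count_squares(1500, 700)

# A column crossing two small squares:
runs = dark_runs([0, 1, 1, 0, 0, 1, 1, 0])
```

The run-based variant does not need tuned sum ranges, but it is more sensitive to stray dark pixels, so the threshold step has to be clean.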

Upvotes: 1
