Reputation: 651
I need to detect and decode a relatively small QR code (110x110 pixels) in a large image (2500x2000) on a Raspberry Pi. The QR code can be at any location in the frame, but the orientation is expected to be normal, i.e. top-up. We are using high quality industrial cameras and lenses, so images are generally good quality and in focus.
Currently, I am able to detect and decode the QR code reliably with pyzbar when I crop the image around it using a window of approx. 600x500. If I attempt to decode the full image, the symbol is not detected/decoded.
I have written a loop that slides a crop window over the image, and attempts to decode each cropped frame separately. I move the window by 50% each iteration to ensure I don't miss any symbols at the edge of the window.
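For reference, the window placement described above can be sketched as follows (the 600x500 window and 50% step come from the text; the helper name is mine):

```python
def sliding_windows(w, h, win_w=600, win_h=500):
    """Yield (x, y) top-left corners of overlapping crop windows.

    The window moves by 50% of its size each step, and the final
    row/column is clamped so the windows always reach the border.
    """
    xs = list(range(0, max(w - win_w, 0) + 1, win_w // 2))
    ys = list(range(0, max(h - win_h, 0) + 1, win_h // 2))
    if w > win_w and xs[-1] != w - win_w:
        xs.append(w - win_w)  # clamp last column to the right edge
    if h > win_h and ys[-1] != h - win_h:
        ys.append(h - win_h)  # clamp last row to the bottom edge
    for y in ys:
        for x in xs:
            yield x, y
```

Each `(x, y)` then becomes a crop `image[y:y + win_h, x:x + win_w]` that is passed to `pyzbar.decode()`.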
I have also tried using OpenCV for detection/decoding, but the performance was no better than with pyzbar.
The main problem affecting my current project is that the sliding window approach is difficult to tune, inefficient, and slow. The same issues would affect other projects where I might reuse this approach.
How can I find the approximate location of the QR code(s) so I can crop the image accordingly?
I am interested in any solutions to improve the detection/decoding performance, but prefer ones that (a) use machine learning techniques (I'm an ML newbie but willing to learn), (b) use OpenCV image pre-processing, or (c) make improvements to my basic cropping algorithm.
Here is one of the sample images that I'm using for testing. The lighting is purposely poor to approximate the worst-case scenario; however, the individual codes still detect and decode correctly when cropped.
Upvotes: 17
Views: 15597
Reputation: 811
I think I have found a simple yet reliable way to detect the corners of the QR code. However, my approach assumes there is some contrast (the more the better) between the QR code and its surrounding area. Also, we have to keep in mind that neither pyzbar nor OpenCV's QRCodeDetector is 100% reliable.
So, here is my approach:
1. Resizing. pyzbar is not completely scale invariant. Although I don't have references that can back this claim, as a rule of thumb I still use small to medium images for barcode detection. You can skip this step, as it might seem completely arbitrary.
import cv2
import numpy as np
from pyzbar import pyzbar

image = cv2.imread("image.jpg")
scale = 0.3
width = int(image.shape[1] * scale)
height = int(image.shape[0] * scale)
image = cv2.resize(image, (width, height))
2. Thresholding. Convert the image to grayscale and apply Otsu thresholding, so the (dark) QR modules come out white in the binary image.
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
_, thresh = cv2.threshold(gray, 120, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)
3. Dilation + contours. This step is a little trickier, and I apologize if my English is not completely clear here. We can see from the previous image that there are black gaps between the white regions inside the QR code. If we were to just find the contours, OpenCV would treat these regions as separate entities rather than parts of a whole. If we want the QR code to appear as a single white square, we have to do a bit of morphological processing; namely, we have to dilate the image.
# The bigger the kernel, the more the white region increases.
# If the resizing step was ignored, then the kernel will have to be bigger
# than the one given here.
kernel = np.ones((3, 3), np.uint8)
thresh = cv2.dilate(thresh, kernel, iterations=1)
contours, _ = cv2.findContours(thresh, cv2.RETR_LIST, cv2.CHAIN_APPROX_SIMPLE)
4. Filtering and getting bounding boxes. Most of the found contours are too small to contain a barcode, so we have to filter them in order to make our search space smaller. After filtering out the weak candidates, we can fetch the bounding boxes of the strong ones.
EDIT: In this case we are filtering by area (small area = weak candidate), but we can also filter by the extent of the detection. The extent measures the rectangularity of an object (contour area divided by bounding-box area), and we can use that information since we know a QR code is roughly square. I chose the extent threshold to be pi / 4, since that is the extent of a perfect circle, meaning we also filter out circular objects.
bboxes = []
for cnt in contours:
area = cv2.contourArea(cnt)
xmin, ymin, width, height = cv2.boundingRect(cnt)
extent = area / (width * height)
# filter non-rectangular objects and small objects
if (extent > np.pi / 4) and (area > 100):
bboxes.append((xmin, ymin, xmin + width, ymin + height))
5. Detect barcodes. We have reduced our search space to just the actual QR codes! Now we can finally use pyzbar without worrying too much about it taking too long to do barcode detection.
qrs = []
info = set()
for xmin, ymin, xmax, ymax in bboxes:
roi = image[ymin:ymax, xmin:xmax]
detections = pyzbar.decode(roi, symbols=[pyzbar.ZBarSymbol.QRCODE])
for barcode in detections:
info.add(barcode.data)
# bounding box coordinates
x, y, w, h = barcode.rect
qrs.append((xmin + x, ymin + y, xmin + x + w, ymin + y + h))
Unfortunately, pyzbar was only able to decode the information of the largest QR code (b'3280406-001'), even though both barcodes were in the search space. As for knowing how many times a particular code was detected, you can use a Counter object from the collections standard module. If you don't need that information, you can just use a set as I did here.
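As a quick illustration of the Counter suggestion (the payloads below are made up for the example):

```python
from collections import Counter

# Hypothetical payloads collected across several decoded crops.
decoded = [b'3280406-001', b'3280406-001', b'3280406-002']

counts = Counter(decoded)  # per-code detection counts
unique = set(counts)       # the same information a plain set gives
```

`counts[b'3280406-001']` would then report how many crops decoded that particular code.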
Hope this could be of help :).
Upvotes: 22
Reputation: 5304
This solution is quite inefficient as it's rather on the brute-force side, but it should work given that the image size is 2500x2000, the detection threshold is roughly 500x500, and the barcode size is 110x110.
Theoretically it's better than the sliding window approach, but probably not by much, as it's more a variation of it.
The idea is that if we cannot reliably find the barcode in the full image, but we can find it within a subsection of the image, then we can break the image into subsections, one of which will contain the barcode. Since we can't be sure the barcode won't be broken up when we split the image into sections, we must be thorough and ensure it falls entirely within one of the possible sections.
First, split the image into a grid of 500x500 cells (5x4, i.e. 20 cells) and run pyzbar on each of them. Theory: the barcode is either fully within one of those cells or it is split across a cell boundary.
If that fails to find a barcode, offset the grid by 250 on the x axis and run it again; then try the grid offset by 250 on the y axis and run it again; then try 250 on both axes and run it again. In theory, this should ensure the barcode exists whole within some 500x500 or smaller cell. If that doesn't work, I recommend using a 250x250 grid instead (which will of course take 4x longer to run), as 250 is still more than twice the barcode size.
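The four grid passes just described can be sketched as follows, assuming the question's 2500x2000 frame (the function name and cell representation are mine):

```python
def grid_passes(w, h, cell=500, off=250):
    """Yield one list of (x, y, cell_w, cell_h) cells per pass, in the
    order described above: base grid, x-offset, y-offset, both offsets."""
    for ox, oy in [(0, 0), (off, 0), (0, off), (off, off)]:
        yield [(x, y, min(cell, w - x), min(cell, h - y))
               for y in range(oy, h, cell)
               for x in range(ox, w, cell)]
```

Run pyzbar on every cell of the first pass, and only fall through to the later passes when nothing decodes.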
Other suggestions provided here can be used to narrow down the search area: discard sections that cannot possibly contain a barcode, or focus on sections that likely do. Sebastian's answer would likely work rather well for deciding which grid cells to focus on.
Another option is to consider where the barcode most likely is. I would expect the barcode to be closer to the center of the frame, so a spiraling search pattern going from the inside out might help.
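A center-out ordering can be approximated without an actual spiral by sorting the grid cells by their distance from the frame center (a sketch; the helper is hypothetical):

```python
def center_out(cells, w, h):
    """Sort (x, y, cw, ch) cells so those nearest the frame center come first."""
    cx, cy = w / 2, h / 2
    return sorted(cells, key=lambda c: (c[0] + c[2] / 2 - cx) ** 2
                                       + (c[1] + c[3] / 2 - cy) ** 2)
```

Decoding then stops at the first cell that yields a barcode, so centrally placed codes are found after only a few attempts.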
Upvotes: 0