Mainmeister

Reputation: 21

Find the rectangle being clicked on in png with python

I have a png containing rectangles. I want to display this png so that the user can click inside one of the rectangles and the returned value is the top left and bottom right coordinates in pixels of the rectangle.

Here is a representative image (warehouse shelving floor plan).

I've tried to figure out how to do this in OpenCV, using blob detection and mapping the mouse click coordinates to one of the blobs, but I keep hitting brick walls getting OpenCV to install and work. Is OpenCV the right library to use in this case?

The following code only finds about 8 blobs:

# Standard imports
import cv2
import numpy as np

# Read image
im = cv2.imread("/home/mainmeister/PycharmProjects/WarehousrLineMap/public/warehouse.png", cv2.IMREAD_GRAYSCALE)

# Set up the detector with default parameters.
parameters = cv2.SimpleBlobDetector_Params()
parameters.filterByColor = 1
parameters.blobColor = 255
#detector = cv2.SimpleBlobDetector()
detector = cv2.SimpleBlobDetector_create(parameters)

# Detect blobs.
keypoints = detector.detect(im)

# Draw detected blobs as red circles.
# cv2.DRAW_MATCHES_FLAGS_DRAW_RICH_KEYPOINTS ensures the size of the circle corresponds to the size of blob
#im_with_keypoints = cv2.drawKeypoints(im, keypoints, np.array([]), (0,0,255), cv2.DRAW_MATCHES_FLAGS_DRAW_RICH_KEYPOINTS)
im_with_keypoints = cv2.drawKeypoints(im, keypoints, np.array([]),(255,0,0),cv2.DRAW_MATCHES_FLAGS_DRAW_RICH_KEYPOINTS)

# Show keypoints
cv2.imshow("Keypoints", im_with_keypoints)
cv2.waitKey(0)

Upvotes: 1

Views: 200

Answers (3)

Christoph Rackwitz

Reputation: 15575

Apply "connected components labeling" (with stats). This may require some preprocessing (thresholding). The labeling gives you an "image" containing a "label" (number) for each pixel.

Then if you click on any pixel, you can immediately look up its bounding box.


Video: https://imgur.com/a/vSnnjSE (the green box is made larger for visualization)

Some of the core logic:

import cv2 as cv

floorplan = cv.imread("14rgT.png")
(height, width) = floorplan.shape[:2]

# keep only light pixels (the room interiors); walls become background
lower = (192,) * 3
upper = (255,) * 3
mask = cv.inRange(floorplan, lower, upper)

(nlabels, labels, stats, centroids) = cv.connectedComponentsWithStats(mask)

# the_label = labels[y,x] # mouse click coordinates
# (x0,y0,w,h,area) = stats[the_label] # bounding box, true area in pixels

The rest is GUI.

This is sensitive to "rooms" having a proper, contiguous perimeter. Any holes in the walls will connect rooms.

You can fix that up to some extent with morphology operations.

Upvotes: 2

tintin98

Reputation: 101

I would suggest first getting the corners in the image using Harris corner detection (OpenCV example).

Corner Image

As you can see, the corners are detected properly. If not, I would suggest playing with the parameters documented in OpenCV.

NOTE: The above is just a demonstration. You may also need some pre-processing, such as erosion and/or noise removal, but that is empirical and I am leaving it to you.

Now get the coordinates of the mouse click (see Fatema's answer for how tl_x and tl_y are obtained) and check which of the corners from the Harris corner detection step is closest. If you want the top-left corner, check only the corners above and to the left of the click point.

If spurious (accidental) clicks are possible, add a threshold on the x and y distance from the click point, and only search for corners within it.

Alternate approach: I haven't used this, but you can try the scikit-image implementation of Harris corner detection and then extract peaks from the image using corner_peaks, as shown here. This looks more convenient.
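The corner search described above could be sketched roughly as follows. The function name `nearest_top_left`, the `max_dist` parameter, and the sample corner list are all hypothetical, and the sketch uses a single Euclidean distance limit rather than separate x and y limits; the corners would normally come from the Harris step.

```python
import math

def nearest_top_left(corners, click, max_dist=None):
    """Return the corner closest to `click` among those above and to the
    left of it, or None if there is none within `max_dist`."""
    cx, cy = click
    best, best_d = None, math.inf
    for (x, y) in corners:
        if x > cx or y > cy:        # skip corners right of or below the click
            continue
        d = math.hypot(cx - x, cy - y)
        if d < best_d:
            best, best_d = (x, y), d
    if max_dist is not None and best_d > max_dist:
        return None                 # treat the click as spurious
    return best

# Hypothetical corner coordinates (x, y) standing in for detector output.
corners = [(10, 10), (40, 10), (10, 30), (40, 30), (50, 35)]

print(nearest_top_left(corners, (20, 15)))              # (10, 10)
print(nearest_top_left(corners, (45, 33)))              # (40, 30)
print(nearest_top_left(corners, (20, 15), max_dist=5))  # None
```

The bottom-right corner can be found the same way by flipping the two comparisons.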

Upvotes: 0

Fatema_

Reputation: 31

Instead of detecting blobs, you can display the image with a callback function that captures mouse events. When you press the left button, the callback records that position as the top-left corner; when you release it, the callback records that position as the bottom-right corner and draws the rectangle.

Here's how you can do it:

import cv2

def mouse_callback(event, x, y, flags, param):
    global tl_x, tl_y, br_x, br_y, drawing
    if event == cv2.EVENT_LBUTTONDOWN:
        # button pressed: remember the starting corner
        drawing = True
        tl_x, tl_y = x, y
    elif event == cv2.EVENT_LBUTTONUP:
        # button released: remember the end corner and draw the rectangle
        drawing = False
        br_x, br_y = x, y
        cv2.rectangle(img, (tl_x, tl_y), (br_x, br_y), (0, 255, 0), 2)
        cv2.imshow("Image", img)
        print(f"Top-left coordinates: ({tl_x}, {tl_y}), Bottom-right coordinates: ({br_x}, {br_y})")

# Read image
img = cv2.imread(r"G:\ITI\yy.png")  # raw string so the backslashes are not treated as escapes

# Create a window to display the image
cv2.namedWindow("Image")

# Initialize variables
tl_x, tl_y, br_x, br_y = -1, -1, -1, -1
drawing = False

# Set the mouse callback function
cv2.setMouseCallback("Image", mouse_callback)

# Display the image
cv2.imshow("Image", img)

# Wait until a key is pressed
cv2.waitKey(0)

# Destroy all OpenCV windows
cv2.destroyAllWindows()

Sample output:

Top-left coordinates: (489, 388), Bottom-right coordinates: (489, 388)
Top-left coordinates: (284, 329), Bottom-right coordinates: (284, 329)
Top-left coordinates: (200, 151), Bottom-right coordinates: (200, 151)
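Note that the callback labels the press point as top-left and the release point as bottom-right, which only holds when the drag goes down and to the right. A small helper could sort the two points into true corners; the name `normalize_rect` is hypothetical and not part of the code above.

```python
def normalize_rect(x0, y0, x1, y1):
    """Return (left, top, right, bottom) regardless of drag direction."""
    return (min(x0, x1), min(y0, y1), max(x0, x1), max(y0, y1))

# Drag from lower-right to upper-left:
print(normalize_rect(300, 200, 120, 80))   # (120, 80, 300, 200)
# A plain click (press and release at the same pixel) is unchanged:
print(normalize_rect(489, 388, 489, 388))  # (489, 388, 489, 388)
```

It would be applied to `tl_x, tl_y, br_x, br_y` in the `EVENT_LBUTTONUP` branch before drawing or printing.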


Upvotes: 0
