How do I use Open3D to use it for image recognition?

Question

I'm trying to use Open3D for image recognition. I want to detect the block, then put a square around it (I have a ply with 524288 points). I'm using the Intel real sense d435 as the camera.

I turned the video (the images of the video) into a NumPy array. I get the interstices of the depth of frames, and create a Open3D image. I create a point cloud from the depth of the image, and flip the point cloud to align with the camera view. Then compute the AABB. Get the bounding box coordinates (I want a box around the block to check that the program is running correctly), then draw the box.

while True:
    # Wait for a coherent pair of frames: depth and color
    frames = pipeline.wait_for_frames()
    depth_frame = frames.get_depth_frame()
    color_frame = frames.get_color_frame()
    
    if not depth_frame or not color_frame:
        continue. 

    # Convert images to numpy arrays
    depth_image = np.asanyarray(depth_frame.get_data())
    color_image = np.asanyarray(color_frame.get_data())

    # Create an Open3D image from the depth image
    depth_o3d = o3d.geometry.Image(depth_image)
    color_o3d = o3d.geometry.Image(color_image)

    # Get the intrinsics of the depth frame
    intrinsics = depth_frame.profile.as_video_stream_profile().intrinsics
    pinhole_camera_intrinsic = o3d.camera.PinholeCameraIntrinsic(
        intrinsics.width, intrinsics.height, intrinsics.fx, intrinsics.fy, intrinsics.ppx, intrinsics.ppy
    )

    # Create a point cloud from the depth image
    rgbd_image = o3d.geometry.RGBDImage.create_from_color_and_depth(
        color_o3d, depth_o3d, convert_rgb_to_intensity=False
    )
    pcd = o3d.geometry.PointCloud.create_from_rgbd_image(
        rgbd_image, pinhole_camera_intrinsic
    )

    # Flip the point cloud to align with the camera view
    pcd.transform([[1, 0, 0, 0],
                   [0, -1, 0, 0],
                   [0, 0, -1, 0],
                   [0, 0, 0, 1]])

    # Compute the axis-aligned bounding box (AABB)
    aabb = pcd.get_axis_aligned_bounding_box()
    aabb.color = (1, 0, 0)  # Set the color of the bounding box to red

    # Get the bounding box coordinates
    min_bound = aabb.get_min_bound()
    max_bound = aabb.get_max_bound()

    # Convert the bounding box coordinates to pixel coordinates
    min_bound_pixel = rs.rs2_project_point_to_pixel(intrinsics, min_bound)
    max_bound_pixel = rs.rs2_project_point_to_pixel(intrinsics, max_bound)

    # Draw the bounding box on the color image
    cv2.rectangle(color_image, (int(min_bound_pixel[0]), int(min_bound_pixel[1])), 
                  (int(max_bound_pixel[0]), int(max_bound_pixel[1])), (0, 255, 0), 2)

    # Display the color image with the bounding box
    cv2.imshow("color window", color_image)

    # Break the loop if 'q' is pressed
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

But the code I have isn't detecting the block. How can I fix that?

How do I use Open3D to use it for image recognition?

Answers (0)

Related Questions