Reputation: 1
I'm trying to use Open3D for image recognition. I want to detect the block, then put a square around it (I have a ply with 524288 points). I'm using the Intel real sense d435 as the camera.
I turned the video (the images of the video) into a NumPy array. I get the interstices of the depth of frames, and create a Open3D image. I create a point cloud from the depth of the image, and flip the point cloud to align with the camera view. Then compute the AABB. Get the bounding box coordinates (I want a box around the block to check that the program is running correctly), then draw the box.
while True:
# Wait for a coherent pair of frames: depth and color
frames = pipeline.wait_for_frames()
depth_frame = frames.get_depth_frame()
color_frame = frames.get_color_frame()
if not depth_frame or not color_frame:
continue.
# Convert images to numpy arrays
depth_image = np.asanyarray(depth_frame.get_data())
color_image = np.asanyarray(color_frame.get_data())
# Create an Open3D image from the depth image
depth_o3d = o3d.geometry.Image(depth_image)
color_o3d = o3d.geometry.Image(color_image)
# Get the intrinsics of the depth frame
intrinsics = depth_frame.profile.as_video_stream_profile().intrinsics
pinhole_camera_intrinsic = o3d.camera.PinholeCameraIntrinsic(
intrinsics.width, intrinsics.height, intrinsics.fx, intrinsics.fy, intrinsics.ppx, intrinsics.ppy
)
# Create a point cloud from the depth image
rgbd_image = o3d.geometry.RGBDImage.create_from_color_and_depth(
color_o3d, depth_o3d, convert_rgb_to_intensity=False
)
pcd = o3d.geometry.PointCloud.create_from_rgbd_image(
rgbd_image, pinhole_camera_intrinsic
)
# Flip the point cloud to align with the camera view
pcd.transform([[1, 0, 0, 0],
[0, -1, 0, 0],
[0, 0, -1, 0],
[0, 0, 0, 1]])
# Compute the axis-aligned bounding box (AABB)
aabb = pcd.get_axis_aligned_bounding_box()
aabb.color = (1, 0, 0) # Set the color of the bounding box to red
# Get the bounding box coordinates
min_bound = aabb.get_min_bound()
max_bound = aabb.get_max_bound()
# Convert the bounding box coordinates to pixel coordinates
min_bound_pixel = rs.rs2_project_point_to_pixel(intrinsics, min_bound)
max_bound_pixel = rs.rs2_project_point_to_pixel(intrinsics, max_bound)
# Draw the bounding box on the color image
cv2.rectangle(color_image, (int(min_bound_pixel[0]), int(min_bound_pixel[1])),
(int(max_bound_pixel[0]), int(max_bound_pixel[1])), (0, 255, 0), 2)
# Display the color image with the bounding box
cv2.imshow("color window", color_image)
# Break the loop if 'q' is pressed
if cv2.waitKey(1) & 0xFF == ord('q'):
break
But the code I have isn't detecting the block. How can I fix that?
Upvotes: 0
Views: 34