user3165156

Reputation: 79

How to extract depth information from the 3D point cloud data?

I have rgb image (let's call it test.png ) and corresponding 3D cloud points (extracted using stereo camera). Now, I want to use depth information to train my neural network.

Format for 3D point cloud is

.PCD v.7 - Point Cloud Data file format
FIELDS x y z rgb index
SIZE 4 4 4 4 4
TYPE F F F F U
COUNT 1 1 1 1 1
WIDTH 253674
HEIGHT 1
VIEWPOINT 0 0 0 1 0 0 0
POINTS 253674
DATA ascii

How can I extract depth information from the point cloud so that, instead of using just the rgb image, I can add one more channel for depth and train my network on RGBD images?

For example: point cloud information (FIELDS) for two pixels is given as:

1924.064 -647.111 -119.4176 0 25547  
1924.412 -649.678 -119.7147 0 25548

According to the description (from the Cornell grasp dataset), each line is the point in space that intersects a given pixel of test.png, with x, y, and z coordinates relative to the base of the robot that was taking the images (so for our purposes we call this "global space").

You can tell which pixel each line refers to by the final column in each line (labelled "index").
That number is an encoding of the row and column number of the pixel. In all of our images, there are 640 columns and 480 rows. Use the following formulas to map an index to a row, col pair. Note that index = 0 maps to row 1, col 1.

row = floor(index / 640) + 1

col = (index MOD 640) + 1
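
For example, a quick check of these formulas on the first sample point above (index 25547) gives row 40, column 588:

import math

index = 25547  # "index" field of the first sample line above
row = math.floor(index / 640) + 1  # -> 40
col = (index % 640) + 1            # -> 588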

Upvotes: 3

Views: 4000

Answers (1)

Jenny

Reputation: 333

It seems that the file was saved in a processed form rather than directly as (col, row, depth) triples. As the documentation mentions, the pixel position of each point can be recovered from its index:

row = floor(index / 640) + 1
col = (index MOD 640) + 1

Notice that not all pixels are valid: instead of 640x480 = 307200 points, the file holds only 253674 (about 83% of the pixels), resulting in an "unorganized" cloud. Pixels without a corresponding point simply keep a depth of 0 in the code below.

import os
import math
import numpy as np
from PIL import Image


pcd_path = "/path/to/pcd file"
with open(pcd_path, "r") as pcd_file:
    lines = [line.strip().split(" ") for line in pcd_file.readlines()]

img_height = 480
img_width = 640
is_data = False
min_d = 0
max_d = 0
img_depth = np.zeros((img_height, img_width), dtype='f8')
for line in lines:
    if line[0] == 'DATA':  # skip the header; the point data starts after this line
        is_data = True
        continue
    if is_data:
        # FIELDS are x y z rgb index: use z as the depth value (clamped at 0)
        # and the index to recover the pixel position
        d = max(0., float(line[2]))
        i = int(line[4])
        col = i % img_width
        row = math.floor(i / img_width)
        img_depth[row, col] = d
        min_d = min(d, min_d)
        max_d = max(d, max_d)

max_min_diff = max_d - min_d


def normalize(x):
    # scale the depth values to the 0-255 range
    return 255 * (x - min_d) / max_min_diff


img_depth = normalize(img_depth)  # works element-wise on the whole array
# cast to 8-bit before handing the array to PIL
img_depth_file = Image.fromarray(img_depth.astype(np.uint8))
img_depth_file.convert('RGB').save(os.path.join("path/to/output", 'depth_img.png'))

The result image:

[resulting depth image]

Where the original image looks like this:

[original RGB image]
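
To get the RGBD input asked about in the question, one option is to stack the normalized depth map as a fourth channel onto the RGB image. A minimal sketch, assuming test.png is the matching 640x480 RGB image and reusing img_depth from the code above (missing points stay 0 in the depth channel):

rgb = np.array(Image.open("test.png"))         # (480, 640, 3) uint8
depth = img_depth.astype(np.uint8)[..., None]  # (480, 640, 1), already scaled to 0-255
rgbd = np.concatenate([rgb, depth], axis=-1)   # (480, 640, 4) array to feed the network
np.save(os.path.join("path/to/output", "rgbd.npy"), rgbd)  # .npy keeps the 4th channel from being read as alpha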

Upvotes: 1
