Reputation: 79
I have an RGB image (let's call it test.png) and the corresponding 3D point cloud (extracted using a stereo camera). Now, I want to use the depth information to train my neural network.
The format of the 3D point cloud is:
.PCD v.7 - Point Cloud Data file format
FIELDS x y z rgb index
SIZE 4 4 4 4 4
TYPE F F F F U
COUNT 1 1 1 1 1
WIDTH 253674
HEIGHT 1
VIEWPOINT 0 0 0 1 0 0 0
POINTS 253674
DATA ascii
How can I extract the depth information from the point cloud so that, instead of using just the RGB image, I can add one more channel for depth and train my network on an RGBD image?
For example, the point cloud information (FIELDS) for two pixels is given as:
1924.064 -647.111 -119.4176 0 25547
1924.412 -649.678 -119.7147 0 25548
According to the description, each line describes the point in space that intersects that pixel (from test.png), with x, y, and z coordinates (relative to the base of the robot that was taking the images, so for our purposes we call this "global space"). (From the Cornell grasp dataset.)
You can tell which pixel each line refers to by the final column in each line (labelled "index").
That number is an encoding of the row and column number of the pixel. In all of our images,
there are 640 columns and 480 rows. Use the following formulas to map an index to a row, col pair.
Note that index = 0 maps to row 1, col 1.
row = floor(index / 640) + 1
col = (index MOD 640) + 1
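For example, applying these formulas to the two sample points above (a tiny helper just for illustration; the function name is mine):
import math

def index_to_row_col(index, img_width=640):
    # index = 0 maps to row 1, col 1 (1-based, as in the documentation)
    row = math.floor(index / img_width) + 1
    col = (index % img_width) + 1
    return row, col

print(index_to_row_col(25547))  # (40, 588)
print(index_to_row_col(25548))  # (40, 589)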
Upvotes: 3
Views: 4000
Reputation: 333
It seems that the file was saved in a processed manner and not directly as (col, row, depth). As the documentation mentions, we can recover each point's pixel position from its index:
row = floor(index / 640) + 1
col = (index MOD 640) + 1
Notice that not all pixels are valid - so instead of 640x480 points the files contain only about 80% of the data, resulting in an "unorganized" cloud (missing pixels are simply left at depth 0 below).
import os
import math

import numpy as np
from PIL import Image

pcd_path = "/path/to/pcd file"
with open(pcd_path, "r") as pcd_file:
    lines = [line.strip().split(" ") for line in pcd_file.readlines()]

img_height = 480
img_width = 640

is_data = False
min_d = 0
max_d = 0
img_depth = np.zeros((img_height, img_width), dtype='f8')
for line in lines:
    if line[0] == 'DATA':  # the points start after the "DATA ascii" header line
        is_data = True
        continue
    if is_data:
        # z is the third field; clamp negative values to 0
        d = max(0., float(line[2]))
        # the "index" field encodes the pixel position (0-based here)
        i = int(line[4])
        col = i % img_width
        row = math.floor(i / img_width)
        img_depth[row, col] = d
        min_d = min(d, min_d)
        max_d = max(d, max_d)

max_min_diff = max_d - min_d

def normalize(x):
    # scale depth values to the 0-255 range
    return 255 * (x - min_d) / max_min_diff

normalize = np.vectorize(normalize, otypes=[np.float64])
img_depth = normalize(img_depth)

img_depth_file = Image.fromarray(img_depth.astype(np.uint8))
img_depth_file.convert('RGB').save(os.path.join("path/to/output", 'depth_img.png'))
The result image:
Where the original image looks like this:
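Once you have the depth image, you can stack it onto the RGB image to get the 4-channel RGBD input asked about in the question. A minimal sketch, assuming test.png is the matching 640x480 RGB image and img_depth is the normalized array computed above:
import numpy as np
from PIL import Image

rgb = np.asarray(Image.open("test.png").convert("RGB"), dtype=np.float32)  # shape (480, 640, 3)
depth = img_depth.astype(np.float32)[..., None]                            # shape (480, 640, 1)
rgbd = np.concatenate([rgb, depth], axis=-1)                               # shape (480, 640, 4)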
Upvotes: 1