Mikisf
Mikisf

Reputation: 21

Significant difference in KD-Tree density computation times with two similar point clouds in Open3D

I'm working with two different point clouds using Open3D, both containing 50,000,000 points. I'm calculating the point cloud density by building a KD-Tree and then querying the nearest neighbors. However, I'm encountering unexpected behavior: the first point cloud computes density in under 1 second, while the second point cloud takes over 2 minutes, despite both having the same number of points.

As far as I understand, after the KD-Tree is built, querying the closest point should have a time complexity of O(log ⁡n), so I would expect the time for density computation to be similar for both point clouds. However, that is not the case.

Here is my code:

import open3d as o3d
import random
import time

def o3d_compute_density(point_cloud):
    print("Total number of points:", f'{len(point_cloud.points):,}'.replace(',', '.'))

    init_time = time.time()
    kdtree = o3d.geometry.KDTreeFlann(point_cloud)
    print("Time to create kdtree:", time.time() - init_time)

    random_indexes = [random.randint(0, len(point_cloud.points) - 1) for _ in range(min(len(point_cloud.points), 1_000))]

    init_time = time.time()
    sum = 0
    for i in random_indexes:
        point = point_cloud.points[i]
        [_, _, dists] = kdtree.search_knn_vector_3d(point, 2)
        sum += pow(dists[1], 1/2)

    print("Time to calculate density:", time.time() - init_time)
    return sum / len(random_indexes)

# Reading point clouds
pcd1 = o3d.io.read_point_cloud("./pcd1.ply", print_progress=True)
density1 = o3d_compute_density(pcd1)
print("Density:", density1)

pcd2 = o3d.io.read_point_cloud("./pcd2.ply", print_progress=True)
density2 = o3d_compute_density(pcd2)
print("Density:", density2)

Outputs:

For the first point cloud:

Total number of points: 50.000.000  
Time to create kdtree: 20.79 seconds  
Time to calculate density: 0.015 seconds  
Density: 0.0020725614732701775

For the second point cloud:

Total number of points: 50.000.000  
Time to create kdtree: 16.67 seconds  
Time to calculate density: 137.73 seconds  
Density: 0.0024792745855749337

My questions:

  1. Why is there such a large discrepancy in the time it takes to calculate density between the two point clouds, even though they both have the same number of points?

  2. Could this be related to the structure of the point clouds themselves (e.g., distribution of points or point cloud properties)? If so, what would be a good way to analyze that?

  3. Are there any Open3D-specific optimizations I can apply to improve performance consistency across different point clouds?

Any insights or suggestions would be greatly appreciated!

You can download the point clouds I'm using from the following link

Upvotes: 2

Views: 151

Answers (1)

el_grezeq
el_grezeq

Reputation: 133

There's more than one way to build a kd-tree and even the same technique can in sometimes results in very different tree shapes (depth, number of leaves, size, and average distance to the root, etc.). This of course has an influence on the knn search process.

Open3D implementation relies on FLANN. On their webpage it's said that "FLANN is a library for performing fast approximate nearest neighbor searches in high dimensional spaces. It contains a collection of algorithms we found to work best for nearest neighbor search and a system for automatically choosing the best algorithm and optimum parameters depending on the dataset.".

So my guess is that this has to do with FLANN's implementation of kd-trees.

You may look in detail to the research papers provided on FLANN's webpage or in their Github page for implementation details (and maybe ask your question there?).

Otherwise Scipy or Scikit-learn both offer kd-trees implementations that may be worth trying out (to see if there's a difference in behovior).

Upvotes: 0

Related Questions