Reputation: 1248
I am taking this example from the laspy docs, where a new LasData is created and written to a file:
import laspy
import numpy as np
# 0. Creating some dummy data
my_data_xx, my_data_yy = np.meshgrid(np.linspace(-20, 20, 15), np.linspace(-20, 20, 15))
my_data_zz = my_data_xx ** 2 + 0.25 * my_data_yy ** 2
my_data = np.hstack((my_data_xx.reshape((-1, 1)), my_data_yy.reshape((-1, 1)), my_data_zz.reshape((-1, 1))))
# 1. Create a new header
header = laspy.LasHeader(point_format=3, version="1.2")
header.add_extra_dim(laspy.ExtraBytesParams(name="random", type=np.int32))
header.offsets = np.min(my_data, axis=0)
header.scales = np.array([0.1, 0.1, 0.1])
# 2. Create a Las
las = laspy.LasData(header)
las.x = my_data[:, 0]
las.y = my_data[:, 1]
las.z = my_data[:, 2]
las.random = np.random.randint(-1503, 6546, len(las.points), np.int32)
las.write("new_file.las")
My use case is just slightly different: my_data itself comes from a LAZ file (say, every 10th point of it), which has its own LasHeader.
I have seen the possibility to create a new LasData based on the existing header:
header = copy(las.header)
d_las = laspy.LasData(header)
However, I then get an array-dimension mismatch error, presumably because point_count in the old header does not match the new data.
The question is then: if I create a LAZ by taking every 10th point of an already-existing LAZ, should I manually recompute offsets and scales as in the example above, and manually adjust point_count in the header? Or is there some more elegant way that updates those automatically based on the new x/y/z data I provide?
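To make the offset/scale concern concrete: LAS stores each coordinate as a scaled 32-bit integer relative to the header's offset, so offsets or scales that don't fit the data can overflow or lose precision. A minimal numpy-only sketch of that encoding (no laspy involved; values are illustrative):

```python
import numpy as np

# LAS stores a coordinate as round((coord - offset) / scale) in an int32,
# so the header's offsets/scales must suit the actual data.
coords = np.array([12.34, 56.78, -20.0])
offset = coords.min()   # analogous to header.offsets for one axis
scale = 0.01            # analogous to header.scales
stored = np.round((coords - offset) / scale).astype(np.int32)
decoded = stored * scale + offset
# Round-trip error is bounded by half a scale step
assert np.all(np.abs(decoded - coords) <= scale / 2)
```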
Upvotes: 1
Views: 213
Reputation: 18250
Here is a simple way to filter a LAS file (I tested with the simple.laz file included in the laspy project):
import laspy
las = laspy.read('simple.laz')
las.points = las.points[::10].copy()
las.write('simple-filtered.laz')
Check of the result:
import laspy
las = laspy.read('simple.laz')
print("Original file:", las, sep="\n")
las = laspy.read('simple-filtered.laz')
print("Filtered file:", las, sep="\n")
Output:
Original file:
<LasData(1.2, point fmt: <PointFormat(3, 0 bytes of extra dims)>, 1065 points, 0 vlrs)>
Filtered file:
<LasData(1.2, point fmt: <PointFormat(3, 0 bytes of extra dims)>, 107 points, 0 vlrs)>
There are 1065 points in the original file and 107 in the filtered one.
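The reason the one-liner is enough: las.points behaves like a numpy structured array, so slicing keeps every per-point field aligned, and laspy refreshes the header's point count on write. A numpy-only analogy (field names here are illustrative):

```python
import numpy as np

# Mimic a point record as a structured array and keep every 10th entry
points = np.zeros(1065, dtype=[("x", "i4"), ("y", "i4"), ("intensity", "u2")])
points["x"] = np.arange(1065)
filtered = points[::10].copy()

assert len(filtered) == 107       # same count as the filtered LAS file above
assert filtered["x"][-1] == 1060  # fields stay aligned after slicing
```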
Upvotes: 0
Reputation: 71
When you create a new LasData object by downsampling points (such as taking every 10th point), you need to update the header information manually: in laspy, the point count, offsets, and scales do not automatically adjust to the new data.
There is no automatic mechanism to update the header from the new data, but it is straightforward to adjust the header's point count, offsets, and scales yourself after downsampling:
import laspy
import numpy as np
from copy import copy

las = laspy.read("existing_file.laz")

# Keep every 10th point
indices = np.arange(0, len(las.points), 10)
my_data_xx = las.x[indices]
my_data_yy = las.y[indices]
my_data_zz = las.z[indices]

# Copy the original header, then fix the fields that no longer match the data
header = copy(las.header)
header.point_count = len(indices)
header.offsets = np.min([my_data_xx, my_data_yy, my_data_zz], axis=1)
header.scales = np.array([0.1, 0.1, 0.1])

d_las = laspy.LasData(header)
d_las.x = my_data_xx
d_las.y = my_data_yy
d_las.z = my_data_zz
d_las.write("downsampled_file.laz")
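A quick numpy-only sanity check of the manually updated values (variable names mirror the snippet above and the data is fabricated; no laspy required):

```python
import numpy as np

# Fake coordinates standing in for the contents of a LAZ file
rng = np.random.default_rng(0)
xyz = rng.uniform(-50.0, 50.0, (1065, 3))

indices = np.arange(0, len(xyz), 10)
sub = xyz[indices]
point_count = len(indices)   # value that would go into header.point_count
offsets = sub.min(axis=0)    # values that would go into header.offsets

assert point_count == len(sub)  # header count matches the data length
assert np.all(sub >= offsets)   # offsets never exceed any coordinate
```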
Upvotes: 0
Reputation: 841
Reusing the header from the original LAS/LAZ file can cause exactly the mismatches you are seeing with point counts, offsets, and scales.
You should create a new header instead:
header = laspy.LasHeader(point_format=3, version="1.2")
header.offsets = np.min(new_data, axis=0)
header.scales = np.array([0.1, 0.1, 0.1])
Your complete code should look something like the following, with a few more changes such as recalculating the offsets and scales and creating a new LasData object.
import laspy
import numpy as np
my_data_xx, my_data_yy = np.meshgrid(np.linspace(-20, 20, 15), np.linspace(-20, 20, 15))
my_data_zz = my_data_xx ** 2 + 0.25 * my_data_yy ** 2
my_data = np.hstack((my_data_xx.reshape((-1, 1)), my_data_yy.reshape((-1, 1)), my_data_zz.reshape((-1, 1))))
header = laspy.LasHeader(point_format=3, version="1.2")
header.add_extra_dim(laspy.ExtraBytesParams(name="random", type=np.int32))
header.offsets = np.min(my_data, axis=0)
header.scales = np.array([0.1, 0.1, 0.1])
las = laspy.LasData(header)
las.x = my_data[:, 0]
las.y = my_data[:, 1]
las.z = my_data[:, 2]
las.random = np.random.randint(-1503, 6546, len(las.points), np.int32)
las.write("new_file.las")
An extra dimension named "random" is added to store a random integer value for each point. Once the point data and the extra dimension are set, the LAS file is written to "new_file.las", ensuring that all data is formatted correctly and aligned with the newly created header. This avoids potential mismatches between the point data and the header metadata.
I hope this resolves your query; if you run into any other issue with this code, let me know.
Upvotes: 0