Reputation: 375
I have many classified lidar point cloud files, which I want to convert to GeoTIFF raster files. For that I wrote a function that creates the JSON pipeline file required by PDAL and then executes that pipeline.
import glob
import json

import pdal

tiles = []
for file in glob.glob("*.las"):
    tiles.append(file)

def select_points_and_raster(file, class_nr, resolution):
    filename_out = file.split('.')[0] + '_' + str(do) + '.tif'
    config = json.dumps([
        file,
        {'type': 'filters.range', 'limits': classification[class_nr]},
        {'resolution': resolution, 'radius': resolution * 1.414,
         'gdaldriver': 'GTiff',
         'output_type': ['mean'],
         'filename': filename_out}
    ])
    pipeline = pdal.Pipeline(config)
    pipeline.execute()
    return filename_out

for i in range(len(tiles)):
    print(f'do file {tiles[i]}')
    filename_out = select_points_and_raster(tiles[i], class_nr, resolution)
    print(f'finished and wrote {filename_out}')
where classification is a dictionary mapping ground/buildings/vegetation to the corresponding classification numbers, so I don't have to remember the numbers.
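For illustration, such a mapping might look like the following sketch. The key names and the exact range strings are assumptions, not taken from the question; PDAL's filters.range expects limits in its range syntax, e.g. "Classification[2:2]" to select class 2 (ground in the ASPRS LAS classification scheme):

```python
# Hypothetical example of such a dictionary; the values are limit strings
# in PDAL's filters.range syntax, keyed by a human-readable name.
classification = {
    'ground': 'Classification[2:2]',        # ASPRS class 2
    'vegetation': 'Classification[3:5]',    # low/medium/high vegetation
    'buildings': 'Classification[6:6]',     # ASPRS class 6
}

print(classification['ground'])  # Classification[2:2]
```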
This works fine serially, iterating over each file in tiles. However, as I have many files, I would like to use multiple cores. How do I split the task to make use of at least the four cores my machine has? I have tried the following:
from multiprocess import Pool

ncores = 2
pool = Pool(processes=ncores)
pool.starmap(select_points_and_raster,
             [([file for file in tiles], classification[class_nr], resolution)])
pool.close()
pool.join()
but that does not work: I get an AttributeError: 'list' object has no attribute 'split'. But I'm not passing a list, or am I? Is this generally the right way to parallelize this?
Upvotes: 0
Views: 578
Reputation: 1
def select_points_and_raster(args):
    file, class_nr, resolution = args
    filename_out = file.split('.')[0] + '_' + str(do) + '.tif'
    config = json.dumps([
        file,
        {'type': 'filters.range', 'limits': classification[class_nr]},
        {'resolution': resolution, 'radius': resolution * 1.414,
         'gdaldriver': 'GTiff',
         'output_type': ['mean'],
         'filename': filename_out}
    ])
    pipeline = pdal.Pipeline(config)
    pipeline.execute()
    return filename_out
info = []
for i in range(len(tiles)):
    print(f'do file {tiles[i]}')
    info.append((tiles[i], class_nr, resolution))

from multiprocess import Pool

ncores = 2  # or: ncores = cpu_count() - 1
pool = Pool(ncores)
pool.map(select_points_and_raster, info)
pool.close()
pool.join()
This is what works for me. In your call you are passing a single tuple whose first element is the entire list of file names, [([...], class_nr, resolution)], so inside the function file is a list, and file.split fails with exactly the error you saw. You are also passing classification[class_nr] instead of just class_nr, even though the function already performs the classification lookup itself.
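Alternatively, if you prefer to keep the original three-argument signature rather than unpacking a tuple inside the function, Pool.starmap works as long as you build one argument tuple per file. A minimal sketch (using the stdlib multiprocessing, which has the same Pool API as the multiprocess package; the worker below is a stand-in that only builds the output filename, not the real PDAL call):

```python
from multiprocessing import Pool

# Stand-in for select_points_and_raster: keeps the original three-argument
# signature and builds the output filename the same way the real function does.
def select_points_and_raster(file, class_nr, resolution):
    return file.split('.')[0] + '_' + str(class_nr) + '.tif'

if __name__ == '__main__':
    tiles = ['a.las', 'b.las']   # hypothetical input files
    class_nr, resolution = 2, 1.0

    # One (file, class_nr, resolution) tuple per tile: the comprehension builds
    # a list of argument tuples, not a single tuple containing the whole list.
    args = [(f, class_nr, resolution) for f in tiles]

    with Pool(processes=2) as pool:
        results = pool.starmap(select_points_and_raster, args)
    print(results)  # ['a_2.tif', 'b_2.tif']
```

The key difference from the attempt in the question is where the comprehension sits: it must produce the outer list of tuples, so that starmap unpacks one tuple into one call.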
Upvotes: 0