I am writing a function to bin points by their angle in a polar coordinate system. I would like the option to perform some nonlinear downsampling of the points in each bin (e.g. taking the median coordinate, or the coordinate with minimum or maximum distance). I am able to split my array into views of each bin, but since the bins vary in size, I have not found a way to operate on each slice while fully utilizing vectorization.
I have achieved my best solution by sorting the points by angle, computing a quantized copy, and identifying the indices between quantized gaps. I then split the sorted array with the index key.
At this point, I would like to compute metrics for each bin without using loops. I can't simply stack the slices into a 3D array since they vary in length. My workaround so far is to build an array of NaNs of shape [num_slices, length_of_largest_slice, 2], populate it along axis 0 with each slice (leaving the unindexed portions as NaN), and finally compute my metrics with NaN-ignoring operations. I don't believe this is memory efficient, and I assume that populating the array is quite slow.
Example code below:
import numpy as np

def downsample_bins(points, bin_size, mode='mean'):
    polar_points = get_polar(points)  # convert points to polar coordinates, sorted by angle
    quantized = polar_points[:, 1] // bin_size  # quantize the angles to the provided resolution
    split_key = np.nonzero(np.diff(quantized))[0] + 1  # indices where the quantized value changes
    max_size_key = np.append(np.insert(split_key, 0, 0), quantized.shape[0])  # add first and last index for size computation
    split_polar = np.split(polar_points, split_key)  # split sorted points at the gap indices
    dim_0 = len(split_polar)  # number of bins
    dim_1 = max(np.diff(max_size_key))  # size of the largest bin
    reshaped_array = np.full(shape=(dim_0, dim_1, 2), fill_value=np.nan)  # NaN-padded array for the inhomogeneous slices
    for idx, arr in enumerate(split_polar):
        reshaped_array[idx, :arr.shape[0], :] = arr
    if mode == 'mean':
        res = np.nanmean(reshaped_array, axis=1)
    elif mode == 'median':
        res = np.nanmedian(reshaped_array, axis=1)
    elif mode == 'closest':
        min_indices = np.nanargmin(reshaped_array[:, :, 0], axis=-1)  # index of min r in each bin
        res = reshaped_array[np.arange(dim_0), min_indices, :]
    elif mode == 'furthest':
        max_indices = np.nanargmax(reshaped_array[:, :, 0], axis=-1)  # index of max r in each bin
        res = reshaped_array[np.arange(dim_0), max_indices, :]
    return get_cartesian(res)
I'm wondering if numpy.ufunc or numpy.vectorize could be used to solve this? I have seen map used to similar ends, but I'm not sure how efficient this would be compared to a full numpy solution.
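For reductions that exist as ufuncs, I can imagine np.ufunc.reduceat collapsing the variable-size bins in a single call, using the same split indices I already compute. A toy sketch of what I have in mind (the data here is made up; starts plays the role of my split key with the leading 0 prepended):

```python
import numpy as np

# Toy data: 6 points (r, theta) already sorted by angle, forming 3 bins.
polar_points = np.array([[1.0, 0.1],
                         [2.0, 0.2],
                         [3.0, 0.6],
                         [4.0, 1.1],
                         [5.0, 1.2],
                         [6.0, 1.3]])
split_key = np.array([2, 3])         # gap indices, as computed above
starts = np.insert(split_key, 0, 0)  # start index of each bin

# Per-bin sum and min of r, each in one vectorized call
sums = np.add.reduceat(polar_points[:, 0], starts)
mins = np.minimum.reduceat(polar_points[:, 0], starts)
counts = np.diff(np.append(starts, len(polar_points)))
means = sums / counts                # per-bin mean of r
```

This would seem to cover the mean/closest/furthest modes, but I don't see a reduceat-style equivalent for median, which is why I'm unsure whether this approach (or np.vectorize) can fully replace the padded-NaN array.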