psaibharadwaj

Reputation: 61

Vectorize loop for cKDTree search over two arrays

I have a CSV file with latitude, longitude and elevation values at scattered locations, and I want to apply IDW (inverse distance weighted) interpolation to generate a regular grid. I use scipy.spatial.cKDTree to find the nearest neighbors of each grid point and estimate the elevation there. The following code works fine when the output grid z is smaller than about 1000 x 1000, but it becomes very slow for larger grids. Please help me vectorize the for loop while still using cKDTree. Thank you.

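import csv
import numpy as np
from scipy import spatial

## Inverse distance weighted function
def idw(p, dist, values):
    # p (the query point) is unused; it is kept so the call in the loop stays unchanged
    dist_pow = np.power(dist, 2)
    nominator = np.sum(values/dist_pow)
    denominator = np.sum(1/dist_pow)
    if denominator > 0:
        return nominator/denominator
    else:
        return None
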
## Reading the lat/lon and elevation values from file
lat = []
lon = []
ele = []

with open('VSKP_ground_dat.csv') as read:
    csvreader = csv.DictReader(read)
    for row in csvreader:
        lat.append(float(row['LAT']))
        lon.append(float(row['LON']))
        ele.append(float(row['ALT']))
xycoord = np.c_[lon,lat]
ele_arr = np.array(ele)

## ------------- Creating KDTree
point_tree = spatial.cKDTree(xycoord, leafsize=25)
## ------------- Creating empty grid matrix with np.zeros
xmin, xmax, ymin, ymax  = 81.903158, 83.352158, 17.25856, 18.40056
## --------- Defining resolution
xres, yres = 0.01, 0.01

x = np.arange(xmin, xmax, xres)
y = np.arange(ymin, ymax, yres)
z = np.zeros((x.shape[0], y.shape[0]), dtype=np.float16)


for i, val1 in enumerate(x):
    for j, val2 in enumerate(y):
        p = np.array([val1, val2])
        # points_idx = point_tree.query_ball_point(p, dist_2)
        distances, points_idx = point_tree.query(p, k=6, eps=0)
        ele_vals = ele_arr[points_idx]
        value = idw(p, distances, ele_vals)
        z[i,j] = value

Upvotes: 2

Views: 651

Answers (1)

Daniel F

Reputation: 14399

First, fix up your idw function so that it reduces over the last axis (the k neighbours of each point):

def idw(dist, values, p = 2):
    out = np.empty(dist.shape[:-1])
    mask = np.isclose(dist, 0).any(-1)
    out[mask] = values[np.isclose(dist, 0)]                   # should be only one per point
    dist_pow = np.power(dist[~mask], -p)                      # division is costly, do it once
    nominator = np.sum(values[~mask] * dist_pow, axis = -1)  # over mask to prevent divide by zero
    denominator = np.sum(dist_pow, axis = -1)
    out[~mask] = nominator / denominator
    return out
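
For reference, the function above evaluates the usual inverse distance weighted estimate over the k nearest neighbours of each query point. The symbols here ($\hat{z}$, $z_k$, $d_k$) are notation introduced for illustration and do not appear in the original post; $p$ is the power argument, which defaults to 2:

$$
\hat{z}(\mathbf{x}) =
\begin{cases}
z_k, & \text{if } d_k = 0 \text{ for some neighbour } k, \\[6pt]
\dfrac{\sum_{k} z_k \, d_k^{-p}}{\sum_{k} d_k^{-p}}, & \text{otherwise.}
\end{cases}
$$

The first case is what the mask handles: a grid point that coincides with a sample takes that sample's value, which avoids dividing by zero.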

Then do the rest based on the np.meshgrid output, querying the tree once for the whole grid:

x = np.arange(xmin, xmax, xres)                           # len i
y = np.arange(ymin, ymax, yres)                           # len j
xy = np.stack(np.meshgrid(x, y, indexing = 'ij'), axis = -1)  # shape (i, j, 2); 'ij' keeps z[i, j] as in the loop
distances, points_idx = point_tree.query(xy, k=6, eps=0)  # shape (i, j, 6)
ele_vals = ele_arr[points_idx]                            # shape (i, j, 6)
z = idw(distances, ele_vals)                              # shape (i, j)
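
As a rough sanity check, the sketch below runs the vectorized approach next to the original double loop and compares the two results. It is self-contained and uses synthetic random points in place of the CSV file, so the coordinates, elevations and grid sizes are made-up assumptions, and `idw` is a condensed restatement of the function above rather than an exact copy.

```python
import numpy as np
from scipy import spatial

def idw(dist, values, p=2):
    """Inverse distance weighting over the last axis (the k neighbours)."""
    out = np.empty(dist.shape[:-1])
    zero = np.isclose(dist, 0)
    mask = zero.any(-1)                  # query points that coincide with a sample
    out[mask] = values[zero]             # assumes at most one coincident sample per point
    w = np.power(dist[~mask], -p)        # weights d**-p, computed once
    out[~mask] = np.sum(values[~mask] * w, axis=-1) / np.sum(w, axis=-1)
    return out

# Synthetic stand-in for the CSV data (hypothetical values, not the real file)
rng = np.random.default_rng(0)
xycoord = rng.uniform(0.0, 1.0, size=(200, 2))
ele_arr = rng.uniform(0.0, 100.0, size=200)
point_tree = spatial.cKDTree(xycoord, leafsize=25)

x = np.linspace(0.0, 1.0, 20)            # len i
y = np.linspace(0.0, 1.0, 10)            # len j
xy = np.stack(np.meshgrid(x, y, indexing="ij"), axis=-1)   # shape (i, j, 2)

# Vectorized: a single query for the whole grid
distances, points_idx = point_tree.query(xy, k=6)          # both shape (i, j, 6)
z_vec = idw(distances, ele_arr[points_idx])                 # shape (i, j)

# Reference: one query per grid point, as in the question
z_loop = np.empty((x.size, y.size))
for i, xv in enumerate(x):
    for j, yv in enumerate(y):
        d, idx = point_tree.query([xv, yv], k=6)
        z_loop[i, j] = np.sum(ele_arr[idx] / d**2) / np.sum(1 / d**2)

print(z_vec.shape, np.allclose(z_vec, z_loop))
```

For very large grids the (i, j, 6) distance and index arrays can become the memory bottleneck; querying the grid in blocks of rows and filling z block by block is one way to keep the vectorized speed without holding everything at once.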

Upvotes: 2
