Reputation: 4946
I do an interpolation with scipy.interpolate.griddata in a 1000 x 1000 grid. When I have a point-cloud with 1,000 (x,y,z)-values, calculation only takes a few seconds. But now I have 1,000,000 values. So I created a loop to extract 1,000 values out of these 1,000,000 values, like this:
p = [...]
z = [...]
#p and z are my lists with 1,000,000 values
p_new = []
z_new = []
for i in range(1000000):
if condition:
#condition is True for about 1000 times
p_new.append(p[i])
z_new.append(z[i])
print 'loop finished'
points = np.array(p_new)
values = np.array(z_new)
grid_z1 = griddata(points, values, (grid_x, grid_y), method='cubic')
plt.imshow(grid_z1.T, origin='lower')
plt.show()
print len(p_new)
returns me 1000
, so my loop works as expected.
But after my loop finished, I canceled my program after 15minutes of waiting because nothing happened.
So finally my question is:
Why takes this calculation so long although in both cases (1000 values by default and 1000 values extracting them out of 1000000) I have the same number of values? In my output loop finished
I can see that loop only takes about 10sec, so it should have nothing to do with my loop =/
Upvotes: 3
Views: 2155
Reputation: 74154
I can't see anything unusual going on here - as far as I can tell the time taken to interpolate is roughly proportional to the number of elements in the point cloud.
Here's some test data:
def fake_data(n):
# xy coordinates for an n-by-n grid
grid = np.indices((n,n),dtype=np.float32).reshape(2,-1).T
# interpolated coordinates
xy_i = grid.copy()
# not monotonically increasing
np.random.shuffle(xy_i)
# values
z = np.random.rand(n**2)
# input coordinates
xy = grid.copy()
# not regularly gridded
xy += np.random.rand(*xy_i.shape)*0.25
# pick n random points to use
inc = np.random.choice(np.arange(n**2),(n,),replace=False)
xy = grid[inc,:]
z = z[inc]
return xy, z, xy_i
For all three methods a log-log plot of N vs time is roughly a straight line, with a slope of ~2, i.e. they all require O(N^2) time.
If, in your case, you see that the lines are not straight but deviate upwards for large values of N, that might indicate that you're having some other problem, such as running out of memory and hitting the swap.
Upvotes: 1