Reputation: 324
My problem is as follows. I have a (large) 3D data set of points in real physical space (x,y,z). It has been generated by a nested for loop that looks like this:
# Generate given dat with its ordering
x_samples = 2
y_samples = 3
z_samples = 4
given_dat = np.zeros(((x_samples*y_samples*z_samples),3))
row_ind = 0
for z in range(z_samples):
for y in range(y_samples):
for x in range(x_samples):
row = [x+.1,y+.2,z+.3]
given_dat[row_ind,:] = row
row_ind += 1
for row in given_dat:
print(row)`
For the sake of comparing it to another set of data, I want to reorder the given data into my desired order as follows (unorthodox, I know):
# Generate data with desired ordering
x_samples = 2
y_samples = 3
z_samples = 4
desired_dat = np.zeros(((x_samples*y_samples*z_samples),3))
row_ind = 0
for z in range(z_samples):
for x in range(x_samples):
for y in range(y_samples):
row = [x+.1,y+.2,z+.3]
desired_dat[row_ind,:] = row
row_ind += 1
for row in desired_dat:
print(row)
I have written a function that does what I want, but it is horribly slow and inefficient:
def bad_method(x_samp,y_samp,z_samp,data):
zs = np.unique(data[:,2])
xs = np.unique(data[:,0])
rowlist = []
for z in zs:
for x in xs:
for row in data:
if row[0] == x and row[2] == z:
rowlist.append(row)
new_data = np.vstack(rowlist)
return new_data
# Shows that my function does with I want
fix = bad_method(x_samples,y_samples,z_samples,given_dat)
print('Unreversed data')
print(given_dat)
print('Reversed Data')
print(fix)
# If it didn't work this will throw an exception
assert(np.array_equal(desired_dat,fix))
How could I improve my function so it is faster? My data sets usually have roughly 2 million rows. It must be possible to do this with some clever slicing/indexing which I'm sure will be faster but I'm having a hard time figuring out how. Thanks for any help!
Upvotes: 0
Views: 59
Reputation: 25548
You could reshape your array, swap the axes as necessary and reshape back again:
# (No need to copy if you don't want to keep the given_dat ordering)
data = np.copy(given_dat).reshape(( z_samples, y_samples, x_samples, 3))
# swap the "y" and "x" axes
data = np.swapaxes(data, 1,2)
# back to 2-D array
data = data.reshape((x_samples*y_samples*z_samples,3))
assert(np.array_equal(desired_dat,data))
Upvotes: 2