Mr. Frobenius

Reputation: 324

Python - Reversing data generated with particular loop order

Question: How could I perform the following task more efficiently?

My problem is as follows. I have a (large) 3D data set of points in real physical space (x,y,z). It has been generated by a nested for loop that looks like this:

import numpy as np

# Generate the given data with its ordering (x varies fastest, then y, then z)
x_samples = 2
y_samples = 3
z_samples = 4
given_dat = np.zeros(((x_samples*y_samples*z_samples),3))
row_ind = 0
for z in range(z_samples):
    for y in range(y_samples):
        for x in range(x_samples):
            row = [x+.1,y+.2,z+.3]
            given_dat[row_ind,:] = row
            row_ind += 1
for row in given_dat:
    print(row)

For the sake of comparing it to another set of data, I want to reorder the given data into my desired order as follows (unorthodox, I know):

# Generate data with the desired ordering (y varies fastest, then x, then z)
x_samples = 2
y_samples = 3
z_samples = 4
desired_dat = np.zeros(((x_samples*y_samples*z_samples),3))
row_ind = 0
for z in range(z_samples):
    for x in range(x_samples):
        for y in range(y_samples):
            row = [x+.1,y+.2,z+.3]
            desired_dat[row_ind,:] = row
            row_ind += 1
for row in desired_dat:
    print(row)

I have written a function that does what I want, but it is horribly slow and inefficient:

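def bad_method(x_samp,y_samp,z_samp,data):
    # Unique z and x coordinates, in ascending order
    zs = np.unique(data[:,2])
    xs = np.unique(data[:,0])
    rowlist = []
    for z in zs:
        for x in xs:
            # Every (z, x) pair triggers a full scan of the data: this is the slow part
            for row in data:
                if row[0] == x and row[2] == z:
                    rowlist.append(row)
    new_data = np.vstack(rowlist)
    return new_data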
# Shows that my function does what I want
fix = bad_method(x_samples,y_samples,z_samples,given_dat)
print('Unreversed data')
print(given_dat)
print('Reversed Data')
print(fix)
# If it didn't work this will throw an exception
assert(np.array_equal(desired_dat,fix))

How could I improve my function so it is faster? My data sets usually have roughly 2 million rows. It must be possible to do this with some clever slicing/indexing, which I'm sure would be faster, but I'm having a hard time figuring out how. Thanks for any help!

Upvotes: 0

Views: 59

Answers (1)

xnx

Reputation: 25548

You could reshape your array, swap the axes as necessary and reshape back again:

# (No need to copy if you don't want to keep the given_dat ordering)
data = np.copy(given_dat).reshape((z_samples, y_samples, x_samples, 3))
# swap the "y" and "x" axes
data = np.swapaxes(data, 1, 2)
# back to 2-D array
data = data.reshape((x_samples*y_samples*z_samples, 3))

assert(np.array_equal(desired_dat,data))
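
If it helps, here is the same idea wrapped in a function as a minimal, self-contained sketch. The name `reorder_zyx_to_zxy` is just something I made up for illustration. It assumes the rows are in exactly the loop order from the question (z outermost, x innermost) and that no rows are missing. It works because a row-major reshape maps row index `z*ny*nx + y*nx + x` to the grid position `[z, y, x]`, so swapping the two middle axes and flattening again makes y the fastest-varying index.

```python
import numpy as np

def reorder_zyx_to_zxy(data, x_samples, y_samples, z_samples):
    """Reorder rows from z-y-x loop order (x fastest) to z-x-y loop order (y fastest).

    Hypothetical helper: assumes `data` has exactly
    x_samples * y_samples * z_samples rows in the original loop order.
    """
    n_cols = data.shape[1]
    # Row-major reshape: axis 0 is z (slowest), axis 1 is y, axis 2 is x (fastest)
    grid = data.reshape(z_samples, y_samples, x_samples, n_cols)
    # Swap the y and x axes so that y becomes the fastest-varying index
    swapped = grid.swapaxes(1, 2)
    # Flatten the three grid axes back into a single row axis
    return swapped.reshape(-1, n_cols)

# Small check using the sample sizes from the question
nx, ny, nz = 2, 3, 4
given = np.array([[x + .1, y + .2, z + .3]
                  for z in range(nz) for y in range(ny) for x in range(nx)])
desired = np.array([[x + .1, y + .2, z + .3]
                    for z in range(nz) for x in range(nx) for y in range(ny)])

result = reorder_zyx_to_zxy(given, nx, ny, nz)
assert np.array_equal(result, desired)
```

There is no Python-level loop over the rows here, so the cost should scale with a single pass over the array rather than with one full scan per (z, x) pair as in `bad_method`.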

Upvotes: 2
