aph

Reputation: 1855

Reshaping an arbitrary collection of numpy arrays

I have a relatively small number k of length N numpy arrays, where k is of order 10, and N is very large, of order 10^7. I am trying to create a single, two-dimensional N x k array that bundles this data in a specific way.

For definiteness, here is a specific example of what I am trying to do.

x = np.array([0,0,0,0])
y = np.array([1,1,1,1])
z = np.array([2,2,2,2])

The array I want at the end is:

p = np.array([[0,1,2], [0,1,2], [0,1,2], [0,1,2]])

Speed is a critical issue, so for-looping is unacceptably slow. I have not been able to figure out how to use np.reshape or np.concatenate to do this, but I know there must be some simple, single line of numpy syntax for this.

Upvotes: 0

Views: 591

Answers (4)

aph

Reputation: 1855

Ok, thanks to everyone for the helpful answers. As I suspected, there are one-liners in numpy for exactly this purpose. Transpose, vstack, and column_stack are all vast speed improvements on what I was doing.

There are four proposed solutions, all of which return the correct arrays (verified in the quick check after this list):

  • concatenate + transpose (David Z)

  • vstack (David Z)

  • pre-allocate + slice-wise assignment (David Z)

  • column_stack (ajcr)
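
For reference, here is a minimal sketch of that correctness check, using the toy arrays from the question (the dtype argument in the pre-allocation is my addition, so the integer dtype matches the other methods):

    import numpy as np

    x = np.array([0, 0, 0, 0])
    y = np.array([1, 1, 1, 1])
    z = np.array([2, 2, 2, 2])

    p1 = np.concatenate((x, y, z)).reshape((3, len(x))).T   # concatenate + transpose
    p2 = np.vstack((x, y, z)).T                             # vstack
    p3 = np.empty((len(x), 3), dtype=x.dtype)               # pre-allocate + slice-wise assignment
    for i, a in enumerate((x, y, z)):
        p3[:, i] = a
    p4 = np.column_stack((x, y, z))                         # column_stack

    # all four methods produce the same N x k array
    assert all(np.array_equal(p1, q) for q in (p2, p3, p4))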

The take-away from the tests below is this: in all regimes, concatenate + transpose is the fastest method.

I've done some simple explorations of how the solutions scale with both k and N, looking at the scaling in both the k ~ N regime and the k << N regime. It's worth noting, though, that even for very large N the runtimes are less than 1 s, so only for MCMC-type applications should one ever really fuss over this sort of thing.
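
For anyone who wants to reproduce these scans, here is a minimal sketch of the kind of timing harness involved; the (k, N) grid shown is purely illustrative, not the actual values used:

    import timeit
    import numpy as np

    def best_time(stmt, env, repeat=3, number=10):
        """Best per-loop time in seconds for stmt, evaluated in namespace env."""
        timer = timeit.Timer(stmt, globals=env)
        return min(timer.repeat(repeat=repeat, number=number)) / number

    # Illustrative (k, N) grid -- the real scans covered a wider range.
    for k, N in [(3, 10**6), (10, 10**6)]:
        arrays = tuple(np.full(N, i) for i in range(k))
        env = {"np": np, "arrays": arrays, "k": k, "N": N}
        t_vstack = best_time("np.vstack(arrays).T", env)
        t_concat = best_time("np.concatenate(arrays).reshape((k, N)).T", env)
        print(f"k={k}, N={N}: vstack {t_vstack*1e3:.1f} ms, concat+T {t_concat*1e3:.1f} ms")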

Here's the summary of the tests I have run:

  • First, I can quantitatively confirm the speed tests run by David Z. Moreover, in the regime he explored, I find that the differences he finds are robust, and not simple fluctuations. See below.

  • In the k << N regime, I notice no appreciable difference between vstack and concatenate + transpose. In this regime, the timings they return are within 1-5% regardless of the values of k and N.

  • In the k << N regime, I notice no appreciable difference between column_stack and pre-allocate & slice-wise assignment. In this regime, the timings they return are within 1-5% regardless of the values of k and N.

  • In the k << N regime, vstack is 50-500% faster than column_stack. This fractional speed difference increases only slowly with N, but rapidly with k; the larger N is, the faster it grows with k.

  • In the k ~ N regime (not relevant for my problem, but possibly in others), concatenate + transpose is the fastest, with pre-allocate & slice-wise assignment trailing by 10-50%.

  • In the k ~ N regime, column_stack and vstack are roughly the same speed, which is 500-1000% slower than concatenate + transpose.

So, as I said above, the upshot is that concatenate + transpose is the fastest in all regimes:

p = np.concatenate((x, y, z)).reshape((3, len(x))).T

However, in the regime relevant to the original question, the vstack method has equivalent performance with a little less rigamarole in the syntax:

p = numpy.vstack((x, y, z)).T

Upvotes: 0

David Z

Reputation: 131550

Here are a few methods you could try:

  • using vstack and transpose:

    p = numpy.vstack((x, y, z)).T
    
  • using concatenate and reshape:

    p = numpy.concatenate((x, y, z)).reshape((3, len(x))).T
    
  • allocating a new array and filling it column by column:

    p = numpy.empty((len(x), 3))
    for i, a in enumerate((x, y, z)):
        p[:,i] = a
    

Along with timing results computed in IPython, for len(x) == len(y) == len(z) == 1e7:

In [57]: %timeit p = numpy.vstack((x, y, z)).T
10 loops, best of 3: 117 ms per loop

In [58]: %timeit p = numpy.concatenate((x, y, z)).reshape((3, len(x))).T
10 loops, best of 3: 120 ms per loop

In [60]: %timeit p = numpy.column_stack((x, y, z))
10 loops, best of 3: 159 ms per loop

In [66]: %%timeit
   ....: p = numpy.empty((len(x), 3), order='C')
   ....: for i, a in enumerate((x, y, z)):
   ....:   p[:,i] = a
   ....: 
10 loops, best of 3: 147 ms per loop

In [67]: %%timeit
   ....: p = numpy.empty((len(x), 3), order='F')
   ....: for i, a in enumerate((x, y, z)):
   ....:   p[:,i] = a
   ....: 
10 loops, best of 3: 119 ms per loop

I've also included the method from ajcr's answer, and tried both row-major and column-major ordering in the last one. There seem to be roughly two sets of methods in terms of timing, the 120ms-like methods and the 150ms-like methods, and it's probably noteworthy that row-major order ('C') is one of the latter set whereas column-major order ('F') is one of the former set.
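
That split is consistent with memory layout: the loop assigns one column at a time, and only in a Fortran-ordered array is a column a contiguous block of memory. A quick way to see the difference (just an illustration, not part of the timings above):

    import numpy as np

    p_c = np.empty((10, 3), order='C')   # row-major: rows are contiguous
    p_f = np.empty((10, 3), order='F')   # column-major: columns are contiguous

    # A column slice is contiguous only in the Fortran-ordered array, so
    # column-wise assignment writes sequential memory there.
    print(p_c[:, 0].flags['C_CONTIGUOUS'])   # False
    print(p_f[:, 0].flags['C_CONTIGUOUS'])   # True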

I suspect these values are not reliable enough to distinguish between the methods. I'd encourage you to do your own tests and see which is fastest.

Upvotes: 1

Alex Riley

Reputation: 176750

You could use column_stack:

>>> np.column_stack([x, y, z])
array([[0, 1, 2],
       [0, 1, 2],
       [0, 1, 2],
       [0, 1, 2]])

Internally this makes the three arrays 2D (without making copies if possible), transposes them, and then concatenates them. The concatenate function is an internal C function so it is likely to be efficient speed-wise.
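
Roughly speaking, those internal steps are equivalent to the following (a sketch of the idea, not the actual library source):

    import numpy as np

    x = np.array([0, 0, 0, 0])
    y = np.array([1, 1, 1, 1])
    z = np.array([2, 2, 2, 2])

    # Each 1-D array becomes an N x 1 column -- a view, so no data is copied --
    # and the columns are then joined along axis 1.
    cols = [a[:, np.newaxis] for a in (x, y, z)]
    p = np.concatenate(cols, axis=1)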

Upvotes: 3

Zero

Reputation: 76917

You could use np.concatenate

In [117]: np.concatenate(([x], [y], [z]), axis=0).T
Out[117]:
array([[0, 1, 2],
       [0, 1, 2],
       [0, 1, 2],
       [0, 1, 2]])

Also, you could append the arrays iteratively and then transpose.

Note: this loops three times, and np.append copies the array on each call.

In [113]: arr = np.empty((0,4), int)

In [114]: for el in [x, y, z]:
   .....:     arr = np.append(arr, [el], axis=0)
   .....:

In [115]: arr
Out[115]:
array([[0, 0, 0, 0],
       [1, 1, 1, 1],
       [2, 2, 2, 2]])

In [116]: arr.T
Out[116]:
array([[0, 1, 2],
       [0, 1, 2],
       [0, 1, 2],
       [0, 1, 2]])
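
Because of those copies, for many or larger arrays it is usually cheaper to collect the pieces in a plain Python list and stack once at the end. A minimal sketch of that alternative:

    import numpy as np

    x = np.array([0, 0, 0, 0])
    y = np.array([1, 1, 1, 1])
    z = np.array([2, 2, 2, 2])

    rows = []                 # list append is cheap; no array copies yet
    for el in (x, y, z):
        rows.append(el)
    arr = np.vstack(rows)     # one allocation and copy at the end
    p = arr.T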

Upvotes: 1
