Double AA

Reputation: 5969

Combining NumPy arrays

I have two 20x100x3 NumPy arrays that I want to combine into a 40x100x3 array, that is, just add more rows along the first axis. I am confused about which function I want: is it vstack, hstack, column_stack, or maybe something else?

Upvotes: 9

Views: 59334

Answers (5)

wim

Reputation: 362717

By the way, there is also r_:

>>> from numpy import r_, vstack
>>> from numpy.random import rand
>>> a = rand(20,100,3)
>>> b = rand(20,100,3)
>>> a.shape
(20, 100, 3)
>>> b.shape
(20, 100, 3)
>>> r_[a,b].shape
(40, 100, 3)
>>> (r_[a,b] == vstack([a,b])).all()
True

Upvotes: 3

Michel Samia

Reputation: 4467

I tried a little benchmark between r_ and vstack and the result is very interesting:

import numpy as np

NCOLS = 10
NROWS = 2
NMATRICES = 10000

def mergeR(matrices):
    # Grow the result one matrix at a time with np.r_
    result = np.zeros([0, NCOLS])

    for m in matrices:
        result = np.r_[result, m]

    return result

def mergeVstack(matrices):
    # Stack all matrices in a single call
    return np.vstack(matrices)

def main():
    matrices = tuple(np.random.random([NROWS, NCOLS]) for i in range(NMATRICES))
    mergeR(matrices)
    mergeVstack(matrices)

    return 0

if __name__ == '__main__':
    main()

Then I ran the profiler:

python -m cProfile -s cumulative np_merge_benchmark.py

and the results:

ncalls  tottime  percall  cumtime  percall filename:lineno(function)
...
     1    0.579    0.579    4.139    4.139 np_merge_benchmark.py:21(mergeR)
...
     1    0.000    0.000    0.054    0.054 np_merge_benchmark.py:27(mergeVstack)

So the vstack way is about 77x faster (4.139 s vs. 0.054 s cumulative)!
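One caveat when reading these numbers: the two functions are not called the same way. mergeR calls np.r_ once per matrix and copies the growing result on every iteration, while np.vstack is called once on the whole tuple, so much of the gap likely comes from the loop rather than from r_ itself. A minimal sketch of a like-for-like comparison follows; the names merge_loop and merge_once and the smaller NMATRICES are assumptions made for illustration, and timings will vary by machine, so none are shown.

```python
import timeit
import numpy as np

NCOLS, NROWS, NMATRICES = 10, 2, 1000  # smaller than the original, to keep it quick
matrices = tuple(np.random.random((NROWS, NCOLS)) for _ in range(NMATRICES))

def merge_loop(ms):
    # Repeated concatenation: the result is reallocated and copied each pass
    result = np.zeros((0, NCOLS))
    for m in ms:
        result = np.r_[result, m]
    return result

def merge_once(ms):
    # Single call: indexing np.r_ with a tuple is the same as np.r_[m0, m1, ...]
    return np.r_[tuple(ms)]

t_loop = timeit.timeit(lambda: merge_loop(matrices), number=3)
t_once = timeit.timeit(lambda: merge_once(matrices), number=3)
t_vstack = timeit.timeit(lambda: np.vstack(matrices), number=3)

print("r_ in a loop:   %.4f s" % t_loop)
print("r_ single call: %.4f s" % t_once)
print("vstack:         %.4f s" % t_vstack)
```

All three approaches produce the same array; only the number of intermediate copies differs.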

Upvotes: 4

Ben Racine

Reputation: 551

Might be worth mentioning that

    np.concatenate((a1, a2, ...), axis=0) 

is the general form and vstack and hstack are specific cases. I find it easiest to just know which dimension I want to stack over and provide that as the argument to np.concatenate.
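As a rough illustration of that point, using placeholder arrays a and b with the shapes from the question, each stacking helper matches np.concatenate along one particular axis when the inputs are 3-D:

```python
import numpy as np

a = np.zeros((20, 100, 3))
b = np.ones((20, 100, 3))

rows = np.concatenate((a, b), axis=0)   # what the question asks for
cols = np.concatenate((a, b), axis=1)
depth = np.concatenate((a, b), axis=2)

print(rows.shape)   # (40, 100, 3)
print(cols.shape)   # (20, 200, 3)
print(depth.shape)  # (20, 100, 6)

# For 3-D inputs the helpers give the same results as the explicit axis form
same_as_vstack = np.array_equal(rows, np.vstack((a, b)))
same_as_hstack = np.array_equal(cols, np.hstack((a, b)))
same_as_dstack = np.array_equal(depth, np.dstack((a, b)))
```

The equivalence shown here is for arrays with three dimensions; for 1-D inputs the helpers reshape their arguments first, so the mapping to an axis is different.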

Upvotes: 11

JoshAdel

Reputation: 68682

One of the best ways of learning is experimenting, but I would say you want np.vstack although there are other ways of doing the same thing:

import numpy as np

a = np.ones((20, 100, 3))
b = np.vstack((a, a))

print(b.shape)  # (40, 100, 3)

or

b = np.concatenate((a,a),axis=0)

EDIT

Just as a note, on my machine, for arrays of the size in the OP's question, I find that np.concatenate is about 2x faster than np.vstack:

In [172]: a = np.random.normal(size=(20,100,3))

In [173]: c = np.random.normal(size=(20,100,3))

In [174]: %timeit b = np.concatenate((a,c),axis=0)
100000 loops, best of 3: 13.3 us per loop

In [175]: %timeit b = np.vstack((a,c))
10000 loops, best of 3: 26.1 us per loop
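Outside IPython, a comparable measurement can be sketched with the standard timeit module. The repetition count n is an arbitrary choice, and absolute numbers depend on the machine and NumPy version, so no output is shown here.

```python
import timeit
import numpy as np

a = np.random.normal(size=(20, 100, 3))
c = np.random.normal(size=(20, 100, 3))

n = 10000  # number of calls to time; chosen arbitrarily
t_concat = timeit.timeit(lambda: np.concatenate((a, c), axis=0), number=n)
t_vstack = timeit.timeit(lambda: np.vstack((a, c)), number=n)

print("concatenate: %.2f us per call" % (t_concat / n * 1e6))
print("vstack:      %.2f us per call" % (t_vstack / n * 1e6))
```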

Upvotes: 15

Giltech

Reputation: 596

I believe it's vstack you want:

import numpy

p = array_2  # array_2 stands for an array you already have
q = array_2
p = numpy.vstack([p, q])

Upvotes: 26
