Reputation: 5969
I have two 20x100x3 NumPy arrays that I want to combine into a 40x100x3 array, that is, just add more rows to the array. I am confused about which function I want: is it vstack, hstack, column_stack, or maybe something else?
Upvotes: 9
Views: 59334
Reputation: 362717
By the way, there is also r_:
>>> from numpy import r_, vstack
>>> from numpy.random import rand
>>> a = rand(20,100,3)
>>> b = rand(20,100,3)
>>> a.shape
(20, 100, 3)
>>> b.shape
(20, 100, 3)
>>> r_[a,b].shape
(40, 100, 3)
>>> (r_[a,b] == vstack([a,b])).all()
True
Upvotes: 3
Reputation: 4467
I tried a little benchmark between r_ and vstack and the result is very interesting:
import numpy as np

NCOLS = 10
NROWS = 2
NMATRICES = 10000

def mergeR(matrices):
    # Grow the result one matrix at a time with r_
    result = np.zeros([0, NCOLS])
    for m in matrices:
        result = np.r_[result, m]

def mergeVstack(matrices):
    # Stack all matrices in a single call
    result = np.vstack(matrices)

def main():
    matrices = tuple(np.random.random([NROWS, NCOLS]) for i in range(NMATRICES))
    mergeR(matrices)
    mergeVstack(matrices)
    return 0

if __name__ == '__main__':
    main()
Then I ran profiler:
python -m cProfile -s cumulative np_merge_benchmark.py
and the results:
ncalls tottime percall cumtime percall filename:lineno(function)
...
1 0.579 0.579 4.139 4.139 np_merge_benchmark.py:21(mergeR)
...
1 0.000 0.000 0.054 0.054 np_merge_benchmark.py:27(mergeVstack)
So the vstack way is about 77x faster (4.139 s vs. 0.054 s cumulative time)!
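To see where the time in mergeR goes, here is a minimal sketch of the two patterns side by side. It assumes a smaller NMATRICES than the benchmark (1000, chosen only to keep the check quick) and a seeded generator; neither comes from the original post. It only confirms that both approaches build the same array and does not try to reproduce the timings.

```python
import numpy as np

NCOLS = 10
NROWS = 2
NMATRICES = 1000  # assumption: smaller than the benchmark, to keep this quick

rng = np.random.default_rng(0)  # seeded only so the run is repeatable
matrices = [rng.random((NROWS, NCOLS)) for _ in range(NMATRICES)]

# Incremental growth: each iteration allocates a new array and copies
# every row accumulated so far, plus the new matrix.
grown = np.zeros((0, NCOLS))
for m in matrices:
    grown = np.r_[grown, m]

# Single call: all inputs are combined in one concatenation.
stacked = np.vstack(matrices)

print(grown.shape, stacked.shape)
print(np.array_equal(grown, stacked))
```

If the pieces arrive one at a time, collecting them in a list and stacking once at the end is usually the cheaper pattern.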
Upvotes: 4
Reputation: 551
Might be worth mentioning that
np.concatenate((a1, a2, ...), axis=0)
is the general form and vstack and hstack are specific cases. I find it easiest to just know which dimension I want to stack over and provide that as the argument to np.concatenate.
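As a rough illustration of that point for the 20x100x3 arrays in the question, the sketch below compares np.concatenate along each axis with the corresponding stacking helper. The variable names are made up for the example, and the equivalences shown are for 3-D inputs (the helpers treat 1-D and 2-D inputs slightly differently).

```python
import numpy as np

a = np.zeros((20, 100, 3))
b = np.ones((20, 100, 3))

# Pick the axis that should grow; all other dimensions must match.
rows = np.concatenate((a, b), axis=0)   # same as np.vstack for these inputs
cols = np.concatenate((a, b), axis=1)   # same as np.hstack for these inputs
depth = np.concatenate((a, b), axis=2)  # same as np.dstack for these inputs

print(rows.shape)   # (40, 100, 3)
print(cols.shape)   # (20, 200, 3)
print(depth.shape)  # (20, 100, 6)

print(np.array_equal(rows, np.vstack((a, b))))
print(np.array_equal(cols, np.hstack((a, b))))
print(np.array_equal(depth, np.dstack((a, b))))
```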
Upvotes: 11
Reputation: 68682
One of the best ways of learning is experimenting, but I would say you want np.vstack, although there are other ways of doing the same thing:
import numpy as np
a = np.ones((20,100,3))
b = np.vstack((a,a))
print(b.shape)  # (40, 100, 3)
or
b = np.concatenate((a,a),axis=0)
EDIT
Just as a note, on my machine, for arrays of the size in the OP's question, I find that np.concatenate is about 2x faster than np.vstack:
In [172]: a = np.random.normal(size=(20,100,3))
In [173]: c = np.random.normal(size=(20,100,3))
In [174]: %timeit b = np.concatenate((a,c),axis=0)
100000 loops, best of 3: 13.3 us per loop
In [175]: %timeit b = np.vstack((a,c))
10000 loops, best of 3: 26.1 us per loop
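For anyone without IPython, a comparable measurement can be sketched with the standard timeit module. The loop count n is an arbitrary choice, and the numbers will differ by machine and NumPy version, so no expected output is given.

```python
import timeit

import numpy as np

a = np.random.normal(size=(20, 100, 3))
c = np.random.normal(size=(20, 100, 3))

n = 10000  # arbitrary number of repetitions per measurement

# Total time in seconds for n calls of each approach
t_concat = timeit.timeit(lambda: np.concatenate((a, c), axis=0), number=n)
t_vstack = timeit.timeit(lambda: np.vstack((a, c)), number=n)

print(f"concatenate: {t_concat / n * 1e6:.1f} us per call")
print(f"vstack:      {t_vstack / n * 1e6:.1f} us per call")
```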
Upvotes: 15
Reputation: 596
I believe it's vstack you want:
p=array_2
q=array_2
p=numpy.vstack([p,q])
Upvotes: 26