Reputation: 20889
Are there good ways to "expand" a numpy ndarray? Say I have an ndarray like this:
[[1 2]
[3 4]]
And I want each row to contains more elements by filling zeros:
[[1 2 0 0 0]
[3 4 0 0 0]]
I know there must be some brute-force ways to do so (say construct a bigger array with zeros then copy elements from old smaller arrays), just wondering are there pythonic ways to do so. Tried numpy.reshape
but didn't work:
import numpy as np
a = np.array([[1, 2], [3, 4]])
np.reshape(a, (2, 5))
Numpy complains that: ValueError: total size of new array must be unchanged
Upvotes: 63
Views: 145393
Reputation: 61259
You can use numpy.pad
, as follows:
>>> import numpy as np
>>> a=[[1,2],[3,4]]
>>> np.pad(a, ((0,0),(0,3)), mode='constant', constant_values=0)
array([[1, 2, 0, 0, 0],
[3, 4, 0, 0, 0]])
Here np.pad
says, "Take the array a
and add 0 rows above it, 0 rows below it, 0 columns to the left of it, and 3 columns to the right of it. Fill these columns with a constant
specified by constant_values
".
Upvotes: 83
Reputation: 13718
# what you want to expand
x = np.ones((3, 3))
# expand to what shape
target = np.zeros((6, 6))
# do expand
target[:x.shape[0], :x.shape[1]] = x
# print target
array([[ 1., 1., 1., 0., 0., 0.],
[ 1., 1., 1., 0., 0., 0.],
[ 1., 1., 1., 0., 0., 0.],
[ 0., 0., 0., 0., 0., 0.],
[ 0., 0., 0., 0., 0., 0.],
[ 0., 0., 0., 0., 0., 0.]])
borrow from https://stackoverflow.com/a/35751427/1637673, with a little modification.
def pad(array, reference_shape, offsets=None):
"""
array: Array to be padded
reference_shape: tuple of size of narray to create
offsets: list of offsets (number of elements must be equal to the dimension of the array)
will throw a ValueError if offsets is too big and the reference_shape cannot handle the offsets
"""
if not offsets:
offsets = np.zeros(array.ndim, dtype=np.int32)
# Create an array of zeros with the reference shape
result = np.zeros(reference_shape, dtype=np.float32)
# Create a list of slices from offset to offset + shape in each dimension
insertHere = [slice(offsets[dim], offsets[dim] + array.shape[dim]) for dim in range(array.ndim)]
# Insert the array in the result at the specified offsets
result[insertHere] = array
return result
Upvotes: 12
Reputation: 2710
there are also similar methods like np.vstack, np.hstack, np.dstack. I like these over np.concatente as it makes it clear what dimension is being "expanded".
temp = np.array([[1, 2], [3, 4]])
np.hstack((temp, np.zeros((2,3))))
it's easy to remember becase numpy's first axis is vertical so vstack expands the first axis and 2nd axis is horizontal so hstack.
Upvotes: 6
Reputation: 20339
Just to be clear: there's no "good" way to extend a NumPy array, as NumPy arrays are not expandable. Once the array is defined, the space it occupies in memory, a combination of the number of its elements and the size of each element, is fixed and cannot be changed. The only thing you can do is to create a new array and replace some of its elements by the elements of the original array.
A lot of functions are available for convenience (the np.concatenate
function and its np.*stack
shortcuts, the np.column_stack
, the indexes routines np.r_
and np.c_
...), but there are just that: convenience functions. Some of them are optimized at the C level (the np.concatenate
and others, I think), some are not.
Note that there's nothing at all with your initial suggestion of creating a large array 'by hand' (possibly filled with zeros) and filling it yourself with your initial array. It might be more readable that more complicated solutions.
Upvotes: 19
Reputation: 80346
You should use np.column_stack
or append
import numpy as np
p = np.array([ [1,2] , [3,4] ])
p = np.column_stack( [ p , [ 0 , 0 ],[0,0] ] )
p
Out[277]:
array([[1, 2, 0, 0],
[3, 4, 0, 0]])
Append seems to be faster though:
timeit np.column_stack( [ p , [ 0 , 0 ],[0,0] ] )
10000 loops, best of 3: 61.8 us per loop
timeit np.append(p, [[0,0],[0,0]],1)
10000 loops, best of 3: 48 us per loop
And a comparison with np.c_
and np.hstack
[append still seems to be the fastest]:
In [295]: z=np.zeros((2, 2), dtype=a.dtype)
In [296]: timeit np.c_[a, z]
10000 loops, best of 3: 47.2 us per loop
In [297]: timeit np.append(p, z,1)
100000 loops, best of 3: 13.1 us per loop
In [305]: timeit np.hstack((p,z))
10000 loops, best of 3: 20.8 us per loop
and np.concatenate
[that is a even a bit faster than append
]:
In [307]: timeit np.concatenate((p, z), axis=1)
100000 loops, best of 3: 11.6 us per loop
Upvotes: 9
Reputation: 362547
There are the index tricks r_
and c_
.
>>> import numpy as np
>>> a = np.array([[1, 2], [3, 4]])
>>> z = np.zeros((2, 3), dtype=a.dtype)
>>> np.c_[a, z]
array([[1, 2, 0, 0, 0],
[3, 4, 0, 0, 0]])
If this is performance critical code, you might prefer to use the equivalent np.concatenate
rather than the index tricks.
>>> np.concatenate((a,z), axis=1)
array([[1, 2, 0, 0, 0],
[3, 4, 0, 0, 0]])
There are also np.resize
and np.ndarray.resize
, but they have some limitations (due to the way numpy lays out data in memory) so read the docstring on those ones. You will probably find that simply concatenating is better.
By the way, when I've needed to do this I usually just do it the basic way you've already mentioned (create an array of zeros and assign the smaller array inside it), I don't see anything wrong with that!
Upvotes: 57