Chris

Reputation: 31206

Numpy array aggregation in slice/vectorized notation

Using NumPy, how do I create a new array that combines two existing arrays?

Here is the problem:

x = [[a1,a2],[b1,b2],...] # this is an ndarray
y = [a,b,c,...] # ditto

xnew = [[a1,a2,a],...]

or xnew = [([a1,a2],a), ...]

Here is how I would solve it using lists and for loops:

xnew = [(x[i],y[i]) for i in range(len(x))]

How do I do the same thing using numpy?

Upvotes: 0

Views: 82

Answers (1)

hpaulj

Reputation: 231385

This is a straightforward case of concatenation, except that y needs to be transposed:

In [246]: x = np.array([[1,2],[3,4]])
In [247]: y= np.array([[5,6]])
In [248]: np.concatenate((x,y.T),axis=1)
Out[248]: 
array([[1, 2, 5],
       [3, 4, 6]])

That is, one way or another, y has to have as many rows as x. With y as a 2-D row like this, column_stack and hstack require the same transpose.
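If y is 1-D, as in the question, rather than the 2-D row used above, a transpose has no effect and y needs a new axis instead. A minimal sketch, assuming the same sample values:

```python
import numpy as np

x = np.array([[1, 2], [3, 4]])
y = np.array([5, 6])          # 1-D, shape (2,)

# column_stack treats a 1-D array as a column, so no reshaping is needed
a = np.column_stack((x, y))

# concatenate needs matching dimensions, so give y shape (2, 1) first
b = np.concatenate((x, y[:, None]), axis=1)

print(a)
# [[1 2 5]
#  [3 4 6]]
```

Both calls produce the same (2, 3) array.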

In numpy, tuple notation is used for structured array records. That requires defining a compound dtype. If you outline the desired dtype, I can help you construct it.
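As a rough sketch only, here is what such a compound dtype might look like. The field names 'pair' and 'value' are made up for illustration, and the dtype assumes every row of x has the same length:

```python
import numpy as np

x = np.array([[1, 2], [3, 4]])
y = np.array([5, 6])

# Hypothetical compound dtype: 'pair' holds one row of x, 'value' one item of y
dt = np.dtype([('pair', x.dtype, (x.shape[1],)), ('value', y.dtype)])

# Allocate one record per row, then fill each field by name
xnew = np.zeros(len(x), dtype=dt)
xnew['pair'] = x
xnew['value'] = y

print(xnew)            # records display in tuple notation
print(xnew['value'])   # fields are still accessible as ordinary arrays
```

Each record then pairs a row of x with the matching element of y, which is close to the second layout in the question.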

You commented:

Y can be an arbitrary length list, as can X, so I need to keep them separate..

Does that mean there can be a different number of items in Y and X, and that some of those tuples will be incomplete, having an x term but no y, or vice versa? If that's the case, then you'd be best off using a list comprehension and one of the zip tools (regular zip or the one from itertools). numpy arrays are for lists/arrays that match in size.

zip examples:

In [263]: x = [[1,2],[3,4]]
In [264]: y= [5,6,7]         # not nested

zip over the shortest, ignore the longest

In [266]: [(i,j) for i,j in zip(x,y)]
Out[266]: [([1, 2], 5), ([3, 4], 6)]

zip over the longest, pad the shortest with None (this needs import itertools first)

In [267]: [(i,j) for i,j in itertools.zip_longest(x,y)]
Out[267]: [([1, 2], 5), ([3, 4], 6), (None, 7)]
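If None is not a convenient placeholder, zip_longest also accepts a fillvalue argument. A small sketch, where the empty list used as padding is an arbitrary choice for illustration:

```python
from itertools import zip_longest

x = [[1, 2], [3, 4]]
y = [5, 6, 7]

# Pad the shorter input with an empty list instead of None
padded = [(i, j) for i, j in zip_longest(x, y, fillvalue=[])]

print(padded)
# [([1, 2], 5), ([3, 4], 6), ([], 7)]
```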

Upvotes: 3
