NinjaGaiden

Reputation: 3146

Most efficient way to combine and separate numpy array

I need to combine two NumPy arrays (with the same length) into one large array and then split that array back into the originals. I know how to combine the arrays, but I am not sure how to separate them again.

So, combining the arrays:

import numpy as np

x = np.random.randint(5, size=(100000, 3))
y = np.random.randint(5, size=(100000, 1))
a = np.hstack((x, y))

Now I am not sure how to get x and y back again. I have tried:

(_x,_y)=a.shape
_x=-_x
nx=a[:,0]
ny=a[:,_x:,]

For whatever reason, I am not getting the correct x and y back.

Is there a better way to do this?

Upvotes: 0

Views: 419

Answers (1)

ali_m

Reputation: 74172

x.shape is (100000, 3) and y.shape is (100000, 1). np.hstack concatenates arrays along the second (column) dimension, so a.shape == (100000, 4). That means that x corresponds to the first 3 columns of a, and y corresponds to the last column.

You could separate them using slice indexing, like this:

x1 = a[:, :3]   # the first 3 columns of a
y1 = a[:, 3:]   # the remaining column of a
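
As a quick check, here is a minimal sketch that rebuilds the arrays from the question and confirms that the two slices match the originals. The comparison with np.array_equal is my own addition for illustration and is not part of the original snippet.

```python
import numpy as np

x = np.random.randint(5, size=(100000, 3))
y = np.random.randint(5, size=(100000, 1))
a = np.hstack((x, y))          # shape (100000, 4)

x1 = a[:, :3]                  # first 3 columns, shape (100000, 3)
y1 = a[:, 3:]                  # last column, shape (100000, 1)

print(np.array_equal(x, x1))   # True
print(np.array_equal(y, y1))   # True
```

Basic slicing like this returns views onto a rather than copies, so it is cheap; call .copy() on the results if you need arrays that are independent of a.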

What might have confused you is that when you index with an integer you reduce the dimensionality of the returned array by 1. For example, you might have expected a[:, 3] (the fourth column of a, since Python indexing starts at 0) to be the same as y, but instead it is a (100000,) 1D array rather than a (100000, 1) 2D array like y.

To avoid this you can either use slice indexing like in my example above, or you can insert a new dimension of size 1 using np.newaxis:

y2 = a[:, 3, np.newaxis]

or by calling reshape on the output:

y2 = a[:, 3].reshape(-1, 1)

The -1 tells reshape to infer the size of that dimension automatically from the number of elements, so here it becomes 100000.
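
To make the difference in shapes concrete, here is a small sketch that compares the indexing options side by side. It uses a tiny (5, 4) array as a stand-in for the (100000, 4) array from the question, and the variable names are only for illustration.

```python
import numpy as np

a = np.arange(20).reshape(5, 4)        # small stand-in for the large array

col_int = a[:, 3]                      # integer index drops the column axis
col_slice = a[:, 3:]                   # slice keeps the column axis
col_newaxis = a[:, 3, np.newaxis]      # re-insert an axis of size 1
col_reshape = a[:, 3].reshape(-1, 1)   # -1 is inferred as 5 here

print(col_int.shape)       # (5,)
print(col_slice.shape)     # (5, 1)
print(col_newaxis.shape)   # (5, 1)
print(col_reshape.shape)   # (5, 1)
```

The three 2D results contain the same values, so which one to use is mostly a matter of readability.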

Upvotes: 2
