Nathan majicvr.com

Reputation: 1031

How to append numpy array to numpy array of different size?

I have 2 arrays to concatenate:

X_train's shape is (3072, 50000)
y_train's shape is (50000,)

I'd like to concatenate them so I can shuffle the indices all in one go. I have tried the following, but neither works:

np.concatenate([X_train, np.transpose(y_train)])
np.column_stack([X_train, np.transpose(y_train)])

How can I concatenate them?

Upvotes: 0

Views: 586

Answers (2)

sascha

Reputation: 33522

A recommendation aimed at your underlying task rather than the concatenation problem itself: don't do this!

Assuming X are your samples / observations, y are your targets:

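Just generate a random permutation and index both arrays with it. The originals are not modified; note that indexing with an integer array returns shuffled copies, not views. E.g. (untested):

import numpy as np

X = np.random.random(size=(50000, 3072))  # rows: samples, cols: features
y = np.random.random(size=50000)          # one target per sample

perm = np.random.permutation(X.shape[0])  # assuming X.shape[0] == y.shape[0]
X_perm = X[perm]  # shuffled copy; row i still matches y_perm[i]
y_perm = y[perm]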

Reminder: your starting shapes are not compatible with most Python-based ML tools, as the usual interpretation is:

  • first-dim / rows: samples
  • second-dim / cols: features

Since the number of samples must equal the number of target values in y, my example follows this convention, while yours needs a transpose on X.
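A minimal sketch of the same idea applied to the layout in the question, assuming X_train is stored as (features, samples). The small sizes, the generated data, and the trick of writing the labels into the first feature row (only there so the pairing can be checked) are illustrative assumptions, not part of the original data:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical small stand-ins for the (3072, 50000) and (50000,) arrays.
n_features, n_samples = 4, 6
X_train = rng.random((n_features, n_samples))
y_train = np.arange(n_samples)

# For checking only: copy the labels into the first feature row so the
# sample/label pairing can be verified after shuffling.
X_train[0] = y_train

# Transpose to the usual (samples, features) layout; .T is a view.
X = X_train.T

# One shared permutation keeps each sample aligned with its target.
perm = rng.permutation(n_samples)
X_shuffled = X[perm]        # integer-array indexing returns a copy
y_shuffled = y_train[perm]
```

Because both arrays are indexed with the same `perm`, no concatenation is needed to keep samples and targets together.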

Upvotes: 2

Nathan majicvr.com

Reputation: 1031

As DavidG said, the issue is that y_train has shape (50000,), so I needed to reshape it to (1, 50000) before concatenating:

np.concatenate([X_train, np.reshape(y_train, (1, 50000))])

Still, this evaluated very slowly in Jupyter. If there's a faster answer, I'd be grateful to have it.
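For reference, a small sketch of why the transpose attempts fail and what the reshape fixes, using tiny hypothetical stand-in arrays rather than the real data. np.transpose leaves a 1-D array unchanged, so y_train stays (50000,) and cannot be stacked under a 2-D array. The concatenation itself allocates a new array and copies everything into it, which is probably part of why it feels slow at full size, though that is a guess since the dtype isn't stated here:

```python
import numpy as np

# Hypothetical small stand-ins for the (3072, 50000) and (50000,) arrays.
X_train = np.zeros((3, 5))
y_train = np.arange(5)

# Transposing a 1-D array is a no-op: the shape stays (5,).
y_transposed = np.transpose(y_train)

# Adding a leading axis gives shape (1, 5), which can be stacked as a row.
y_row = y_train[np.newaxis, :]   # same result as np.reshape(y_train, (1, 5))
combined = np.concatenate([X_train, y_row], axis=0)   # shape (4, 5)

# Shuffle the columns (samples) in one go, then split the labels back off.
rng = np.random.default_rng(0)
perm = rng.permutation(combined.shape[1])
shuffled = combined[:, perm]
X_shuffled, y_shuffled = shuffled[:-1], shuffled[-1]
```

Note that the combined array takes a common dtype, so integer labels come back as floats here.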

Upvotes: 0
