Reputation: 6192
I want to make a matrix x with shape (n_samples, n_classes) where each x[i] is a random one-hot vector. Here's a slow implementation:
x = np.zeros((n_samples, n_classes))
J = np.random.choice(n_classes, n_samples)
for i, j in enumerate(J):
    x[i, j] = 1
What's a more pythonic way to do this?
Upvotes: 8
Views: 6151
Reputation: 214927
For the assignment part, you can use advanced indexing:
import numpy as np

# initialize data
n_samples = 3
n_classes = 5
x = np.zeros((n_samples, n_classes))
J = np.random.choice(n_classes, n_samples)
# assign with advanced indexing
x[np.arange(n_samples), J] = 1
x
#array([[ 0.,  1.,  0.,  0.,  0.],
#       [ 0.,  1.,  0.,  0.,  0.],
#       [ 1.,  0.,  0.,  0.,  0.]])
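The advanced-indexing assignment above can be wrapped in a small helper. This is only a minimal sketch: the function name random_one_hot and the seed parameter are illustrative assumptions, not part of the answer, and it draws the labels with np.random.default_rng instead of np.random.choice so the result can be reproduced.

```python
import numpy as np

def random_one_hot(n_samples, n_classes, seed=None):
    """Return an (n_samples, n_classes) array whose rows are random one-hot vectors."""
    rng = np.random.default_rng(seed)
    # one random class label per sample, drawn from 0 .. n_classes - 1
    J = rng.integers(0, n_classes, size=n_samples)
    x = np.zeros((n_samples, n_classes))
    # pair row i with column J[i] and set that single entry to 1
    x[np.arange(n_samples), J] = 1
    return x

x = random_one_hot(3, 5, seed=0)
```

Every row of x should sum to 1, which is a quick way to check the result.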
Another option is to use OneHotEncoder from sklearn:
n_samples = 3
n_classes = 5
J = np.random.choice(n_classes, n_samples)
from sklearn.preprocessing import OneHotEncoder
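# assumes scikit-learn >= 1.2, where categories and sparse_output
# replace the older n_values and sparse arguments
enc = OneHotEncoder(categories=[np.arange(n_classes)], sparse_output=False)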
enc.fit_transform(J.reshape(-1,1))
#array([[ 1.,  0.,  0.,  0.,  0.],
#       [ 0.,  0.,  0.,  0.,  1.],
#       [ 0.,  1.,  0.,  0.,  0.]])
Upvotes: 4
Reputation: 402323
Create an identity matrix using np.eye:
x = np.eye(n_classes)
Then use np.random.choice to select rows at random:
x[np.random.choice(x.shape[0], size=n_samples)]
As a shorthand, just use:
np.eye(n_classes)[np.random.choice(n_classes, n_samples)]
Demo:
In [90]: np.eye(5)[np.random.choice(5, 100)]
Out[90]:
array([[ 1.,  0.,  0.,  0.,  0.],
       [ 1.,  0.,  0.,  0.,  0.],
       [ 0.,  0.,  1.,  0.,  0.],
       [ 0.,  0.,  0.,  0.,  1.],
       [ 0.,  0.,  0.,  1.,  0.],
       [ 1.,  0.,  0.,  0.,  0.],
       [ 0.,  0.,  0.,  1.,  0.],
.... (... to 100)
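The reason this works is that row j of the identity matrix is exactly the one-hot vector for class j, so indexing np.eye with an array of labels gathers one such row per sample. Here is a small sketch with fixed labels instead of random ones, so the result is predictable; the variable names labels, one_hot and recovered are illustrative, and dtype=int is an optional choice for integer output.

```python
import numpy as np

n_classes = 3
# fixed labels (instead of np.random.choice) so the result is deterministic
labels = np.array([2, 0, 1, 2])

# row j of the identity matrix is the one-hot vector for class j;
# indexing with the label array picks one row per sample
one_hot = np.eye(n_classes, dtype=int)[labels]
# one_hot is:
# [[0 0 1]
#  [1 0 0]
#  [0 1 0]
#  [0 0 1]]

# argmax along each row recovers the original labels
recovered = one_hot.argmax(axis=1)
```

Note that this builds an n_classes x n_classes matrix first, which is fine for a modest number of classes but may be wasteful when n_classes is very large.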
Upvotes: 21