michaelsnowden

Reputation: 6192

Random one-hot matrix in numpy

I want to make a matrix x with shape (n_samples, n_classes) where each x[i] is a random one-hot vector. Here's a slow implementation:

import numpy as np

x = np.zeros((n_samples, n_classes))
J = np.random.choice(n_classes, n_samples)  # one random class index per row
for i, j in enumerate(J):
    x[i, j] = 1  # set the chosen column in row i

What's a more pythonic way to do this?

Upvotes: 8

Views: 6151

Answers (2)

akuiper

Reputation: 214927

For the assignment part, you can use advanced indexing:

import numpy as np

# initialize data
n_samples = 3
n_classes = 5
x = np.zeros((n_samples, n_classes))
J = np.random.choice(n_classes, n_samples)

# assign with advanced indexing: row i gets a 1 in column J[i]
x[np.arange(n_samples), J] = 1

x
#array([[ 0.,  1.,  0.,  0.,  0.],
#       [ 0.,  1.,  0.,  0.,  0.],
#       [ 1.,  0.,  0.,  0.,  0.]])
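
To check that the advanced-indexing assignment really produces one-hot rows, here is a minimal sketch. The helper name random_one_hot and the seeded np.random.default_rng generator are assumptions added for illustration; they are not part of the snippet above.

```python
import numpy as np

def random_one_hot(n_samples, n_classes, seed=None):
    """Hypothetical helper: returns (one-hot matrix, chosen class indices)."""
    rng = np.random.default_rng(seed)
    # One class index per row, drawn uniformly from 0..n_classes-1
    J = rng.integers(0, n_classes, size=n_samples)
    x = np.zeros((n_samples, n_classes))
    # Pair row i with column J[i] and set that single entry to 1
    x[np.arange(n_samples), J] = 1
    return x, J

x, J = random_one_hot(4, 5, seed=0)
# Each row should contain exactly one 1, located at column J[i]
print(x.sum(axis=1))
print(np.array_equal(x.argmax(axis=1), J))
```

The index arrays np.arange(n_samples) and J are matched element by element, so exactly one entry per row is written and no Python-level loop is needed.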

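Another option is to use OneHotEncoder from sklearn:

import numpy as np
from sklearn.preprocessing import OneHotEncoder

n_samples = 3
n_classes = 5
J = np.random.choice(n_classes, n_samples)

# Written for scikit-learn >= 1.2: the old n_values argument was removed
# (use categories instead) and sparse was renamed to sparse_output
enc = OneHotEncoder(categories=[np.arange(n_classes)], sparse_output=False)
enc.fit_transform(J.reshape(-1, 1))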

#array([[ 1.,  0.,  0.,  0.,  0.],
#       [ 0.,  0.,  0.,  0.,  1.],
#       [ 0.,  1.,  0.,  0.,  0.]])

Upvotes: 4

cs95

Reputation: 402323

Create an identity matrix using np.eye:

x = np.eye(n_classes)

Then use np.random.choice to select rows at random:

x[np.random.choice(x.shape[0], size=n_samples)]

As a shorthand, just use:

np.eye(n_classes)[np.random.choice(n_classes, n_samples)]

Demo:

In [90]: np.eye(5)[np.random.choice(5, 100)]
Out[90]: 
array([[ 1.,  0.,  0.,  0.,  0.],
       [ 1.,  0.,  0.,  0.,  0.],
       [ 0.,  0.,  1.,  0.,  0.],
       [ 0.,  0.,  0.,  0.,  1.],
       [ 0.,  0.,  0.,  1.,  0.],
       [ 1.,  0.,  0.,  0.,  0.],
       [ 0.,  0.,  0.,  1.,  0.],
       .... (... to 100)
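
If the float output is not needed, the same row-selection idea can be sketched with an integer dtype and a seeded generator. The variable names labels and one_hot, the seed value, and the use of np.random.default_rng are assumptions for illustration, not part of the answer above.

```python
import numpy as np

n_samples, n_classes = 6, 4

# Seeded generator so the draw is reproducible (the seed is arbitrary)
rng = np.random.default_rng(42)
labels = rng.integers(n_classes, size=n_samples)  # values in 0..n_classes-1

# Row k of the identity matrix is the one-hot vector for class k,
# so indexing with labels picks one such row per sample
one_hot = np.eye(n_classes, dtype=np.int8)[labels]

print(one_hot.shape)
print(one_hot.dtype)
```

Indexing with an integer array returns a new array of shape (n_samples, n_classes), so the identity matrix itself is left unchanged.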

Upvotes: 21
