Reputation: 91
I am trying to implement K-means with selective centroid selection. I have two NumPy arrays: "features", in which each row is a data point, and "labels", in which the entry at index "i" is the class of the data point at index "i". My data points belong to 4 different classes. I want to use both arrays to randomly pick one data point from each class. Could you please help me with this? Also, is there a way to zip two NumPy arrays into a dictionary?
For example, my features array is:
[[1,1,1],[1,2,3],[1,6,7],[1,4,6],[1,6,9],[1,4,2]]
and my labels array is:
[1,2,2,3,1,3]
For each unique value in the labels array, I want one randomly chosen corresponding row of the features array. A sample answer would be:
[1,1,1] from class 1
[1,6,7] from class 2
[1,4,2] from class 3
Upvotes: 0
Views: 232
Reputation: 51155
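You can accomplish this with boolean indexing and numpy.unique. Draw the index within each class, so that every chosen row really belongs to the label it is stored under:
import numpy as np
u = np.unique(labels)
idx = [np.random.choice(np.flatnonzero(labels == k)) for k in u]
dict(zip(u, features[idx]))
Sample output (the chosen rows vary between runs):
{1: array([1, 6, 9]), 2: array([1, 2, 3]), 3: array([1, 4, 2])}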
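One way to sanity-check a per-class selection is to confirm that every chosen row actually occurs among the rows carrying that label. The helper below is a minimal sketch; the name `selection_is_valid` and the two example dictionaries are hypothetical, and it assumes `features` is a 2-D NumPy array as described in the question.

```python
import numpy as np

features = np.array([[1, 1, 1], [1, 2, 3], [1, 6, 7], [1, 4, 6], [1, 6, 9], [1, 4, 2]])
labels = np.array([1, 2, 2, 3, 1, 3])

def selection_is_valid(selection, features, labels):
    """Return True if every chosen row appears among the rows of its own label."""
    for label, row in selection.items():
        class_rows = features[labels == label]
        # Compare the chosen row against each row of the class; all(axis=1)
        # marks exact matches, any() asks whether at least one row matched.
        if not (class_rows == row).all(axis=1).any():
            return False
    return True

# Hypothetical selections used only to exercise the check.
good = {1: np.array([1, 1, 1]), 2: np.array([1, 6, 7]), 3: np.array([1, 4, 2])}
bad = {1: np.array([1, 4, 2]), 2: np.array([1, 6, 9]), 3: np.array([1, 1, 1])}

print(selection_is_valid(good, features, labels))  # True
print(selection_is_valid(bad, features, labels))   # False
```

In the second dictionary, [1, 4, 2] is filed under class 1 although its label is 3, so the check fails.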
Upvotes: 0
Reputation: 8923
Try:
import numpy as np
features = np.array([[1,1,1],[1,2,3],[1,6,7],[1,4,6],[1,6,9],[1,4,2]])
labels = np.array([1,2,2,3,1,3])
res = {i: features[np.random.choice(np.where(labels == i)[0])] for i in set(labels)}
Sample output (the chosen rows vary between runs):
{1: array([1, 1, 1]), 2: array([1, 2, 3]), 3: array([1, 4, 2])}
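The same idea can be wrapped in a small function so that the draw is repeatable. This is only a sketch: the function name `pick_one_per_class` and its `seed` parameter are assumptions made for illustration, and it uses NumPy's `default_rng` generator instead of the global `np.random` state.

```python
import numpy as np

def pick_one_per_class(features, labels, seed=None):
    """Return {label: one randomly chosen row of features with that label}."""
    rng = np.random.default_rng(seed)  # seed=None gives a fresh random state
    features = np.asarray(features)
    labels = np.asarray(labels)
    return {
        # .item() turns the NumPy scalar into a plain Python key
        label.item(): features[rng.choice(np.flatnonzero(labels == label))]
        for label in np.unique(labels)
    }

features = np.array([[1, 1, 1], [1, 2, 3], [1, 6, 7], [1, 4, 6], [1, 6, 9], [1, 4, 2]])
labels = np.array([1, 2, 2, 3, 1, 3])

selection = pick_one_per_class(features, labels, seed=0)
for label, row in selection.items():
    print(label, row)
```

Passing the same `seed` again should reproduce the same selection, which can help when comparing K-means runs that start from different initial centroids.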
Upvotes: 0
Reputation: 84
Given this is the setup in your question:
import numpy as np
features = [[1,1,1],[1,2,3],[1,6,7],[1,4,6],[1,6,9],[1,4,2]]
labels = np.array([1,2,2,3,1,3])
This should give you one randomly chosen data point per label, in dictionary form:
features_index = np.arange(len(features))
unique_labels = np.unique(labels)
rand = []
for n in unique_labels:
    # pick one random index among the rows labelled n
    rand.append(features[np.random.choice(features_index[labels == n])])
dict(zip(unique_labels, rand))
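If the explicit loop becomes a bottleneck, a loop-free variant is possible. The sketch below shuffles the row indices once and then uses `np.unique(..., return_index=True)` to take the first occurrence of each label in the shuffled order, which amounts to a uniform random pick within each class. It assumes `features` has been converted to a NumPy array (here it is a list), and the variable names `perm`, `first`, and `chosen` are illustrative.

```python
import numpy as np

features = np.array([[1, 1, 1], [1, 2, 3], [1, 6, 7], [1, 4, 6], [1, 6, 9], [1, 4, 2]])
labels = np.array([1, 2, 2, 3, 1, 3])

rng = np.random.default_rng()
perm = rng.permutation(len(labels))  # random order of the row indices

# first[i] is the position, within the shuffled labels, of the first
# occurrence of unique_labels[i]
unique_labels, first = np.unique(labels[perm], return_index=True)

chosen = perm[first]  # map back to row indices of the original arrays
selection = dict(zip(unique_labels.tolist(), features[chosen]))
print(selection)
```

Zipping the unique labels with the chosen rows is safe because every key occurs exactly once. Zipping the full `labels` array with `features` directly would silently keep only the last row seen for each repeated label.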
Upvotes: 1