patrick

Reputation: 4852

Why does the numpy function `take` change the shape of my array?

I am trying to find which points in my data set (a numpy array called "matrix") are closest to a vector (an array called "vector") in n-dimensional space. Then I want to extract those same rows from a second data set, "matrix_with_labels", which is identical to "matrix" but includes an additional label column.

vector=([1,2,3,...])
matrix=[[1,2,3,...], [2,4,6,...], ...]
matrix_with_labels=[[a,1,2,3,...], [b,2,4,6,...], ...]

Thus, I compute the distances between the vector and each item in the matrix:

dist=scipy.spatial.distance.cdist(matrix,vector,'euclidean')

Then I sort these distances to identify the closest neighbors:

sorted_index=np.argsort(dist, axis=0)

Then I try to sort the "matrix_with_labels" by "sorted_index", using numpy.take as explained in this post on SO.

result= matrix_with_labels.take(sorted_index, 0)

The outcome looks just fine until I try to process it further - it seems to have changed shape:

print(result.shape)
(20, 1, 11)

When I look at the shape of the initial "matrix_with_labels", however:

matrix_with_labels.shape
(20, 11)

The documentation on take says:

subarray : ndarray The returned array has the same type as a.

What am I doing wrong? Any help is appreciated!

Upvotes: 0

Views: 92

Answers (1)

Matt Messersmith

Reputation: 13747

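If you're starting with a (20, 11) shape, I think the only way to get a (20, 1, 11) shape is if the index array x (your sorted_index) has shape (20, 1). take along axis 0 returns an array of shape x.shape + (11,), and cdist gives you a 2-D (20, 1) distance array, so argsort on it is 2-D as well.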

Try result = matrix_with_labels.take(x.reshape(-1), 0).
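
A minimal sketch of the shape behaviour, in case it helps. The data here is random and made up for illustration, the labels are assumed to be numeric, and the distance column is computed with plain numpy as a stand-in for the (20, 1) array that cdist would return; only the shapes matter. The index array is named sorted_index as in the question.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: 20 points with 10 features, plus one numeric label column.
matrix = rng.random((20, 10))
labels = np.arange(20).reshape(-1, 1)
matrix_with_labels = np.hstack([labels, matrix])   # shape (20, 11)
vector = rng.random((1, 10))                       # 2-D, one row

# Stand-in for cdist(matrix, vector, 'euclidean'): a (20, 1) column of distances.
dist = np.linalg.norm(matrix - vector, axis=1, keepdims=True)

# argsort keeps the input's dimensionality, so the indices are also (20, 1).
sorted_index = np.argsort(dist, axis=0)

# take along axis 0 yields shape indices.shape + a.shape[1:].
nested = matrix_with_labels.take(sorted_index, 0)            # 2-D indices
flat = matrix_with_labels.take(sorted_index.reshape(-1), 0)  # 1-D indices

print(sorted_index.shape)  # (20, 1)
print(nested.shape)        # (20, 1, 11)
print(flat.shape)          # (20, 11)
```

Both results hold the same rows in the same order; the 2-D index array only adds an axis of length 1, which flattening the indices avoids.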

Upvotes: 1
