Reputation: 545
Suppose I have a nearest neighbor classifier. For a new observation it computes the distance between the new observation and all observations in the "known" data set. It returns the class label of the observation that has the smallest distance to the new observation.
import numpy as np
known_obs = np.random.randint(0, 10, 40).reshape(8, 5)  # 8 known observations, 5 features each
new_obs = np.random.randint(0, 10, 80).reshape(16, 5)   # 16 new observations to classify
labels = np.random.randint(0, 2, 8)                     # one binary class label per known observation

def my_dist(x1, known_obs, axis=0):
    # squared Euclidean distance from x1 to every known observation
    return np.square(np.linalg.norm(x1 - known_obs, axis=axis))

def nn_classifier(n, known_obs, labels, axis=1, distance=my_dist):
    # label of the known observation closest to n
    return labels[np.argmin(distance(n, known_obs, axis=axis))]

def classify_batch(new_obs, known_obs, labels, classifier=nn_classifier, distance=my_dist):
    # classify each row of new_obs in a Python-level loop
    return [classifier(n, known_obs, labels, distance=distance) for n in new_obs]

print(classify_batch(new_obs, known_obs, labels, nn_classifier, my_dist))
For performance reasons I would like to avoid the for loop in the classify_batch function. Is there a way to use numpy operations to apply the nn_classifier function to each row of new_obs? I already tried np.apply_along_axis, but as is often mentioned, it is convenient rather than fast.
Upvotes: 0
Views: 2367
Reputation: 231395
The key to avoiding the loop is to express the action on the (16,8) array of 'distances'. The labels[] and argmin steps just cloud the issue.
If I set labels = np.arange(8), then this
# one row of squared distances per new observation: shape (16, 8)
arr = np.array([my_dist(n, known_obs, axis=1) for n in new_obs])
print(arr)
print(np.argmin(arr, axis=1))
produces the same result. It still has a list comprehension, but we are closer to the underlying computation.
[[ 32. 115. 22. 116. 162. 86. 161. 117.]
[ 106. 31. 142. 164. 92. 106. 45. 103.]
[ 44. 135. 94. 18. 94. 50. 87. 135.]
[ 11. 92. 57. 67. 79. 43. 118. 106.]
[ 40. 67. 126. 98. 50. 74. 75. 175.]
[ 78. 61. 120. 148. 102. 128. 67. 191.]
[ 51. 48. 57. 133. 125. 35. 110. 14.]
[ 47. 28. 93. 91. 63. 49. 32. 88.]
[ 61. 86. 23. 141. 159. 85. 146. 22.]
[ 131. 70. 155. 149. 129. 127. 44. 138.]
[ 97. 138. 87. 117. 223. 77. 130. 122.]
[ 151. 78. 211. 161. 131. 115. 46. 164.]
[ 13. 50. 31. 69. 59. 43. 80. 40.]
[ 131. 108. 157. 161. 207. 85. 102. 146.]
[ 39. 106. 67. 23. 61. 67. 70. 88.]
[ 54. 51. 74. 68. 42. 86. 35. 65.]]
[2 1 3 0 0 1 7 1 7 6 5 6 0 5 3 6]
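As a quick check (a sketch, reusing the definitions from the question): with labels = np.arange(8), nn_classifier returns the index of the nearest neighbor itself, so the loop-based result and the row-wise argmin must agree.
# with identity labels 0..7, the classifier output is just the nearest-neighbor index
idx_labels = np.arange(8)
assert classify_batch(new_obs, known_obs, idx_labels) == list(np.argmin(arr, axis=1))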
With print((new_obs[:,None,:] - known_obs[None,:,:]).shape) I get a (16,8,5) array. So can I apply linalg.norm on the last axis?
This seems to do the trick:
np.square(np.linalg.norm(diff, axis=-1))
So, putting it together:
diff = new_obs[:, None, :] - known_obs[None, :, :]  # (16, 8, 5) pairwise differences via broadcasting
dist = np.square(np.linalg.norm(diff, axis=-1))     # (16, 8) squared distances
idx = np.argmin(dist, axis=1)                       # index of the nearest known observation per row
print(idx)
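To get class labels rather than indices, index labels with idx; a minimal sketch, assuming the random labels from the question:
# mapping nearest-neighbor indices back to class labels reproduces the loop version
result = labels[idx]
print(result)
print(result.tolist() == classify_batch(new_obs, known_obs, labels))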
Upvotes: 1