Operations on nearest indices numpy array

Question

Here is an example of my NumPy array scores, together with m, the number of nearest neighbors to use:

scores = np.random.normal(-0.2,0.01,1000)
m = int(np.sqrt(scores.shape[0])+0.5)

I want to compare the ith value in scores with its m nearest neighbors (index-wise). The comparison should be done by something similar to

x[i] = (scores[i]-np.mean(scores[m])) / np.sum(scores[m])

where np.mean(scores[m]) and np.sum(scores[m]) stand for the mean and sum of the m nearest neighbors of scores[i]. If it can handle the first and last m indices, that's a bonus. With x as a NumPy array I should be able to use something similar to

scores[x > threshold]

to get all scores that exceed a certain threshold. The idea is to call scores[i] an outlier if it exceeds this particular threshold.

Rotem · Accepted Answer

You can solve this with scipy.ndimage.uniform_filter.

uniform_filter is equivalent to a box blur (a 1-dimensional box filter in your case), that is, a moving average over a window of m elements.
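As a quick check of that equivalence, here is a minimal sketch that compares uniform_filter with an explicit moving average built from np.convolve. The window size m = 5 and the ramp test data are illustrative choices; the comparison is restricted to the interior, because the two approaches treat the array edges differently.

```python
import numpy as np
from scipy.ndimage import uniform_filter

scores = np.arange(1, 30, dtype=float)  # test data: 1.0 .. 29.0
m = 5                                   # odd window, so it is centred on each index

# Box filter via SciPy; 'reflect' mirrors the data at both ends.
filtered = uniform_filter(scores, size=m, mode='reflect')

# Explicit moving average; 'valid' keeps only windows that fit entirely inside the array.
box = np.convolve(scores, np.ones(m) / m, mode='valid')

# Away from the edges the two results should agree.
half = m // 2
interior_matches = np.allclose(filtered[half:-half], box)
print(interior_matches)
```

Only the first and last m // 2 entries differ, since uniform_filter pads the array while the 'valid' convolution simply drops those positions.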

Here is the code:

import numpy as np
import scipy.ndimage

#scores = np.random.normal(-0.2,0.01,1000)
scores = np.array(np.r_[1:30]).astype(float) # Initialize to values 1 to 29 (for testing)
m = int(np.sqrt(scores.shape[0])+0.5) # m = 5 (np.int was removed in NumPy 1.24; use the built-in int)

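# Moving average over a window of m elements; 'reflect' mirrors the data at the edges
# (cval is only used with mode='constant', so it has no effect here)
mean_scores = scipy.ndimage.uniform_filter(scores, size=m, mode='reflect', cval=0.0)
sum_scores = mean_scores * m # Window sum = window mean * window size

# Note the parentheses: subtract the mean first, then divide by the sum
x = (scores - mean_scores) / sum_scores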
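To connect this back to the thresholding step in the question, here is a minimal sketch under a few assumptions that are not in the original post: the seed, the planted outlier at index 500, and the threshold value 0.05 are all illustrative. Because the scores are negative, the window sum is negative too, so the sketch compares the absolute value of x against the threshold. It also shows one possible way to exclude scores[i] itself from its neighborhood, since the uniform filter window includes the center element.

```python
import numpy as np
from scipy.ndimage import uniform_filter1d

rng = np.random.default_rng(0)             # fixed seed (illustrative)
scores = rng.normal(-0.2, 0.01, 1000)
scores[500] = 0.5                          # planted outlier (illustrative)
m = int(np.sqrt(scores.shape[0]) + 0.5)    # m = 32

# Window mean and sum, including the centre element.
mean_scores = uniform_filter1d(scores, size=m, mode='reflect')
sum_scores = mean_scores * m
x = (scores - mean_scores) / sum_scores

# Optional variant: drop the centre element so only the neighbours count.
neighbor_sum = sum_scores - scores
neighbor_mean = neighbor_sum / (m - 1)
x_neighbors = (scores - neighbor_mean) / neighbor_sum

threshold = 0.05                           # hypothetical value, tune for your data
outliers = np.abs(x) > threshold           # abs() because the sums are negative here
print(np.flatnonzero(outliers))            # indices flagged as outliers
print(scores[outliers])                    # the flagged scores
```

With mode='reflect' the first and last indices get a full window of mirrored values, so x is defined for every element of scores.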
