Reputation: 557
An example of my numpy array scores and its number of nearest neighbors m:
import numpy as np
scores = np.random.normal(-0.2, 0.01, 1000)
m = int(np.sqrt(scores.shape[0]) + 0.5)  # np.int was removed in NumPy 1.24; use the built-in int
I want to compare the i-th value in scores with its m nearest neighbors (index-wise). The comparison should be done by something similar to
x[i] = (scores[i]-np.mean(scores[m])) / np.sum(scores[m])
where np.mean(scores[m]) and np.sum(scores[m]) represent the mean and sum of the m nearest neighbors of the i-th value in scores. If it can handle the first and last m indices, that's a bonus. With x as a numpy array, I should be able to use something similar to
scores[x > threshold]
to get all scores that exceed a certain threshold. The idea is to call scores[i] an outlier if it exceeds this particular threshold.
Upvotes: 1
Views: 50
Reputation: 32144
You may solve it using scipy.ndimage.uniform_filter:
uniform_filter is equivalent to a box blur (a 1-dimensional box filter in your case).
Here is the code:
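import numpy as np
import scipy.ndimage
# scores = np.random.normal(-0.2, 0.01, 1000)
scores = np.array(np.r_[1:30]).astype(float)  # Initialize to values 1 to 29 (for testing)
m = int(np.sqrt(scores.shape[0]) + 0.5)  # m = 5 (np.int was removed in NumPy 1.24)
# Moving average over a window of m samples centered on each index (the window includes scores[i]).
# mode='reflect' mirrors the data at both ends, so the first and last indices are handled too.
mean_scores = scipy.ndimage.uniform_filter(scores, size=m, mode='reflect')
sum_scores = mean_scores * m  # Sum over the same window
x = (scores - mean_scores) / sum_scores  # Subtract first, then divide by the window sum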
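To check that the filter really matches the formula in the question, here is a small sketch that compares one interior index against an explicit slice and then applies a threshold. The helper name local_deviation, the flat test data with one planted spike, and the threshold value 0.05 are all made up for illustration. I also compare np.abs(x) against the threshold rather than x itself; that is my assumption, made because the scores are negative, so the window sum is negative and flips the sign of x.

```python
import numpy as np
import scipy.ndimage


def local_deviation(scores, m):
    """Return (scores - window mean) / window sum for a centered window of m samples."""
    # The window includes scores[i] itself; 'reflect' mirrors the data at both ends.
    local_mean = scipy.ndimage.uniform_filter(scores, size=m, mode='reflect')
    local_sum = local_mean * m
    return (scores - local_mean) / local_sum


# Flat test signal with one planted spike (illustrative data, not from the question).
scores = np.full(29, -0.2)
scores[14] = -0.1
m = int(np.sqrt(scores.shape[0]) + 0.5)  # m = 5

x = local_deviation(scores, m)

# Cross-check an interior index against an explicit slice of its m-sample window.
i, half = 14, m // 2
window = scores[i - half:i + half + 1]
expected = (scores[i] - window.mean()) / window.sum()
assert np.isclose(x[i], expected)

threshold = 0.05  # hypothetical value; tune it for your data
outliers = scores[np.abs(x) > threshold]
print(outliers)
```

If you need the window to exclude scores[i] itself, one option is to subtract scores from sum_scores and divide by m - 1 to get the neighbor-only mean, but that changes the definition slightly, so treat it as a variation rather than the same measure.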
Upvotes: 1