Reputation: 3977
I have an array of numbers (like a time series) and I want to find a certain number of high values. The high values (local maxima) should be far enough from each other. My solution:
import numpy as np
import matplotlib.pylab as plt
MIN_DIST = 4
TO_FIND = 2
np.random.seed(103)
x = np.random.normal(0, 1, 20)
x[2] = 4
x[3] = 4.1
x[13] = 3.9
plt.plot(x)
plt.show()
locs = []
for idx in range(TO_FIND):
    loc = x.argmax()
    x[max(0,loc-MIN_DIST):min(loc+MIN_DIST,len(x))] = -2
    locs.append(loc)
print(locs)
This prints the correct answer: 3, 13
In the example above there are two values that are too close together, at indices 2 and 3, so they should be counted only once, as a maximum at index 3 (the bigger value). The second maximum I want to find is at index 13.
The code provided works well. However, I feel like it is a really dumb way to do it. Is there any numpy or mathematical trick (even a dirty trick counts) to achieve this in a cheaper way?
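For reference, here is a minimal sketch of the same greedy loop wrapped in a function. The name `greedy_top_peaks` is hypothetical, and two details differ from the snippet above by assumption: it works on a copy so the input is not overwritten, and it masks with `-np.inf` instead of `-2` so that a masked sample can never be picked again, even if the data drop below -2. It is not meant to be faster, only easier to reuse and compare.

```python
import numpy as np

def greedy_top_peaks(values, to_find, min_dist):
    """Greedily pick the `to_find` largest values, masking a window around each pick."""
    work = np.asarray(values, dtype=float).copy()  # leave the caller's array untouched
    locs = []
    for _ in range(to_find):
        loc = int(work.argmax())
        locs.append(loc)
        # Same half-open window as the original loop: [loc - min_dist, loc + min_dist)
        work[max(0, loc - min_dist):min(loc + min_dist, len(work))] = -np.inf
    return locs

# Deterministic stand-in for the random example: only the three "fake" samples are set.
x = np.zeros(20)
x[2], x[3], x[13] = 4.0, 4.1, 3.9
print(greedy_top_peaks(x, to_find=2, min_dist=4))  # [3, 13]
```

Each iteration is one `argmax` pass over the array, so the cost is roughly `TO_FIND` passes regardless of how many local maxima the data contain.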
Amateur comparison to scipy.signal find_peaks:
import numpy as np
import matplotlib.pylab as plt
import time
from scipy.signal import find_peaks
N = 10000
MIN_DIST = 4
TO_FIND = 2
t1 = 0
t2 = 0
correct = []
for k in range(N):
    y = np.random.normal(0, 1, 10000)
    y[3] = 5
    y[4] = 5.1
    y[11] = 4.9
    # plt.plot(y)
    # plt.show()
    t0 = time.time()
    peaks, _ = find_peaks(y, distance=MIN_DIST)
    t1 += time.time() - t0
    t0 = time.time()
    x = y.copy()
    locs = []
    for idx in range(TO_FIND):
        loc = x.argmax()
        x[max(0,loc-MIN_DIST):min(loc+MIN_DIST,len(x))] = -2
        locs.append(loc)
    t2 += time.time() - t0
    same_answers = all([a == b for a, b in zip(locs, peaks[:TO_FIND])])
    correct.append(same_answers)
print("Correct (same answers):", all(correct))
print("find_peaks:", t1)
print("default:", t2)
find_peaks seems to be a bit slower:
Correct (same answers): True
find_peaks: 3.137294292449951
default: 0.8532450199127197
Also, if I remove the "fake samples" so that the maxima are less distinct, the two methods no longer give the same results.
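One likely reason for the mismatch: `find_peaks` returns peak indices in ascending position order, not sorted by height, so `peaks[:TO_FIND]` is simply the leftmost peaks. The fake samples sit at the very start of the array, which is why the comparison happens to agree. Below is a rough sketch of ranking the detected peaks by value before taking the top ones; the helper name `highest_peaks` is hypothetical. Note also that `find_peaks` only reports local maxima and never the first or last sample, so it can still disagree with the `argmax` loop in some cases.

```python
import numpy as np
from scipy.signal import find_peaks

def highest_peaks(y, to_find, min_dist):
    """Return the indices of the `to_find` tallest peaks, tallest first."""
    peaks, _ = find_peaks(y, distance=min_dist)
    order = np.argsort(y[peaks])[::-1]  # peak positions sorted by descending height
    return peaks[order[:to_find]]

# Deterministic example: the tallest peak is NOT among the leftmost ones.
y = np.zeros(30)
y[3], y[4], y[11], y[20] = 5.0, 5.1, 4.9, 6.0

peaks, _ = find_peaks(y, distance=4)
print(peaks[:2].tolist())               # [4, 11]  (leftmost two peaks)
print(highest_peaks(y, 2, 4).tolist())  # [20, 4]  (tallest two peaks)
```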
Upvotes: 1
Views: 149
Reputation: 22031
I suggest this solution using scipy:
import numpy as np
from scipy.signal import find_peaks
np.random.seed(103)
x = np.random.normal(0, 1, 20)
x[2] = 4
x[3] = 4.1
x[13] = 3.9
MIN_DIST = 4
peaks, _ = find_peaks(x, distance=MIN_DIST, height=3)
peaks
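To illustrate what the `height=3` argument contributes, here is a small sketch on a deterministic array (the extra bump at index 8 is an assumption added for the demonstration, not part of the original data). Without `height`, every local maximum that respects the distance is returned; with it, only peaks of at least that value survive, and their values are available under `"peak_heights"` in the returned properties. The threshold 3 suits this example data and would need adjusting for other signals.

```python
import numpy as np
from scipy.signal import find_peaks

MIN_DIST = 4

x = np.zeros(20)
x[2], x[3], x[13] = 4.0, 4.1, 3.9
x[8] = 1.0  # hypothetical small bump, far enough from both large peaks

all_peaks, _ = find_peaks(x, distance=MIN_DIST)
tall_peaks, props = find_peaks(x, distance=MIN_DIST, height=3)

print(all_peaks.tolist())              # [3, 8, 13]
print(tall_peaks.tolist())             # [3, 13]
print(props["peak_heights"].tolist())  # [4.1, 3.9]
```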
Upvotes: 1