Reputation: 3414
A somewhat new to numpy user, trying to see a "how to in numpy" solution to an algorithm-in-numpy-syntax problem.
I hope it's OK to ask for some help here, as I'm stumped and would love some advice - I hope this isn't TLDR'ish !!
Background:
I'm trying to calculate "prominence" manually to remove peaks from peak-finding, and trying to see if there is a simple "numpy" way to do that, rather than for loops.
Before anyone says, I know scipy.find_peaks can remove peaks with specified prominence, but I was trying to figure out how to do it by-hand using numpy syntax, not for looping over indexes (I know C, I'm trying to learn numpy...):
Reason:
The reason is I need to change the existing scipy logic from "if prominence on either left or right side of a peak is OK (ie big enough), then the peak should remain" to "if either left or right prominence is too small, then delete the peak".
This is equivalent to changing the max(left_min, right_min)
code on line 252 of the scipy code to a min(left_min, right_min)
The scipy _peak_prominences
code to find prominence values uses for loops foward/back over the arrays - I was trying to learn if that is possibly in numpy notation/syntax (without for looping), but after a day's effort, couldn't figure out how to do that - so would love some numpy advice!!
The conceptual logic is simple: find peaks in a profile (1d array), then calculate the height of each peak over the left/right neighbouring valleys (ie the "prominence"), and then remove peaks whose left or right height is less than a certain threshold - ie they aren't "peaky" enough on both sides.
In the example above I want to remove the two center peaks, because the height (prominence) of both over the central "valley" is too small - even though the surrounding two outside valleys heights OK.
So, my numpy help question:
I know if I have a peaks
array that is indexes into a profile
array, I can use np.delete to remove "not peaky enough peaks", if I can calculate a left_height and right_height set of arrays for each peak:
np.delete(peaks, np.where( (left_height<10) | (right_height<10) )[0])
I know for specific values in a peaks array, I can find the "valley" between them by slicing between peak indexes, and finding the min profile value in the slice:
profile[peaks[0]:peaks[1]].min()
what I can't figure out is how to, numpy-wise, calculate the left_height and right_height arrays. Something like
left_height = profile[peaks] - profile[peaks[-1]:peaks].min()
this is essentially the logic of "use a peaks array, with current and previous position, to make a slice, then use that slice to get the profile values, then calculate the minimum over those profile values.
Is that kind of logic possible using just python-istic numpy, or does it need a manual for loop?
Upvotes: 1
Views: 754
Reputation: 231
That's an great way to learn! :)
The first version of peak_prominences
was actually added as a pure Python implementation which you can find here.
That implementation still loops over each peak once and is obviously slower than the Cython implementation but may still be what you are looking for. You'd essentially want to change how the contour height is calculated on the lines 403 and 404 as you suggested yourself already.
what I can't figure out is how to, numpy-wise, calculate the left_height and right_height arrays. Something like
left_height = profile[peaks] - profile[peaks[-1]:peaks].min()
You seem to have gotten quite close to to the solution in at line 382 to 400 where the left and right height are calculated (ignore the window related stuff earlier). Here is the crucial bit that finds the left and right height for a single peak
using your variable names:
# Positions where profile is larger than current peak height
greater_peak = np.where(profile > profile[peak])[0]
try:
# Nearest position to the left of peak with
# profile[left] > profile[peak]
left = greater_peak[greater_peak < peak].max()
except ValueError:
left = 0
try:
# Nearest position to right of peak with
# profile[right] > profile[peak]
right = greater_peak[greater_peak > peak].min()
except ValueError:
right = None
# Base indices to the left and right of peak in profile
left_height = profile[left:peak].argmin() + left
right_height = profile[peak:right].argmin() + peak
I hope this helps in your endeavor.
Disclaimer: I'm the author of that function.
Upvotes: 2