Reputation: 107
I have a string and an associated list of scores, one for each position in that string. How can I remove every character whose score is below a certain threshold?
I can create workarounds that use a list as an intermediary, and I have seen answers that tackle this problem one element at a time, but I haven't seen anything that directly removes multiple elements by position. The main problem with removing them one at a time is that each removal shifts the positions of the remaining elements.
>>> import string
>>> import numpy as np
>>> sequence = ''.join(np.random.choice(list(string.ascii_uppercase), 10))
>>> sequence
'BQJVESXZBW'
>>> scores = np.random.uniform(size=10)
>>> scores
[0.99023134, 0.21286886, 0.10760723, 0.50485956, 0.207736, 0.76909266, 0.62174588, 0.89416775, 0.60837875, 0.32754857]
>>> threshold = 0.50
The output should have the second, third, fifth, and tenth elements deleted, leaving 'BVSXZB'
Upvotes: 0
Views: 434
Reputation: 500327
Here is one way to do it:
In [11]: scores
Out[11]:
array([0.00397126, 0.88897497, 0.06103467, 0.27202612, 0.50436342,
       0.09516024, 0.92886696, 0.24499752, 0.40425165, 0.90589889])
In [12]: ''.join(ch for (ch, score) in zip(sequence, scores) if score >= threshold)
Out[12]: 'BFHC'
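Because the filter builds a new string in a single pass, nothing shifts while you go. Here is the same idea as a self-contained sketch that uses the data from the question (hard-coded as a plain list, which is my assumption for reproducibility) and `itertools.compress` from the standard library in place of the explicit `if`:

```python
from itertools import compress

# Data copied from the question; hard-coded here so the result is reproducible
sequence = 'BQJVESXZBW'
scores = [0.99023134, 0.21286886, 0.10760723, 0.50485956, 0.207736,
          0.76909266, 0.62174588, 0.89416775, 0.60837875, 0.32754857]
threshold = 0.50

# compress() keeps each character whose paired selector is truthy
keep = (score >= threshold for score in scores)
filtered = ''.join(compress(sequence, keep))

print(filtered)  # BVSXZB
```

Both versions keep characters whose score is exactly equal to the threshold; use `>` instead of `>=` if those should be dropped as well.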
Upvotes: 3