Reputation: 1
I have a list of 8-letter sequences like this:
['GQPLWLEH', 'TLYSFFPK', 'TYGEIFEK', 'APYWLINK', ...]
How can I use regular expressions to find all the sequences that have specific letters at specific positions? For example, V, I, F, or Y at the 2nd position and M, L, F, or Y at the 3rd position.
I'm really new to RE, thanks in advance!
Upvotes: 0
Views: 85
Reputation: 2515
Maybe you can avoid using a regexp altogether:
[x for x in mylist if x[1] in 'VIFY' and x[2] in 'MLFY']
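A minimal runnable sketch of the same idea, in case it helps. The last two entries in the list are hypothetical sequences added only so that something passes the filter; the length guard is also an addition, so that a string shorter than three characters is skipped rather than raising an IndexError.

```python
# Sample data from the question, plus two hypothetical entries
# ('AVMKLPQR', 'TFYSFFPK') that satisfy the rule.
mylist = ['GQPLWLEH', 'TLYSFFPK', 'TYGEIFEK', 'APYWLINK', 'AVMKLPQR', 'TFYSFFPK']

second = set('VIFY')  # letters allowed at the 2nd position (index 1)
third = set('MLFY')   # letters allowed at the 3rd position (index 2)

# Keep only sequences whose 2nd and 3rd letters are in the allowed sets.
matches = [s for s in mylist if len(s) >= 3 and s[1] in second and s[2] in third]
print(matches)  # ['AVMKLPQR', 'TFYSFFPK']
```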
Upvotes: 0
Reputation: 2154
\b.[VIFY][MLFY]\w*\b
This may do what you want. You can experiment with regexes online at regex101.
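A word-boundary pattern like this is mostly useful when the sequences sit in one block of text rather than in a list. Here is a rough sketch under that assumption; the text string and its two matching sequences are made up for illustration. Note that it swaps the leading . for \w, which is a deviation from the pattern above: with ., a space before a word starting with, say, IL could be taken as the "first character".

```python
import re

# Hypothetical whitespace-separated text; 'AVMKLPQR' and 'TFYSFFPK'
# are invented sequences that satisfy the rule.
text = "GQPLWLEH AVMKLPQR TYGEIFEK TFYSFFPK"

# \w instead of . so the first character must be part of the word itself.
pattern = re.compile(r"\b\w[VIFY][MLFY]\w*\b")

found = pattern.findall(text)
print(found)  # ['AVMKLPQR', 'TFYSFFPK']
```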
Upvotes: 0
Reputation: 520918
You can try using the following regex pattern:
.[VIFY][MLFY].*
This matches any first character, then one of V, I, F, Y in the second position and one of M, L, F, Y in the third; the trailing .* consumes the rest of the sequence.
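import re

mylist = ['GQPLWLEH', 'TLYSFFPK', 'TYGEIFEK', 'APYWLINK']
# r.match anchors at the start of each string, so no leading ^ is needed
r = re.compile(r".[VIFY][MLFY].*")
# filter() returns a lazy iterator in Python 3, so wrap it in list()
newlist = list(filter(r.match, mylist))
print(newlist)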
Demo here:
Note: I added the word BILL to your list in the demo to get something that passes the regex match.
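Since BILL passes even though it is only four letters long, you may want a stricter variant if every sequence must be exactly 8 uppercase letters. The following is only a sketch under that assumption; the strict pattern and the entry 'BILLABCD' are illustrative additions, not part of the original demo.

```python
import re

# Question data plus BILL (from the demo) and a hypothetical
# 8-letter entry 'BILLABCD'.
mylist = ['GQPLWLEH', 'TLYSFFPK', 'TYGEIFEK', 'APYWLINK', 'BILL', 'BILLABCD']

loose = re.compile(r".[VIFY][MLFY].*")
# Assumed stricter form: 1 letter + 2 constrained letters + 5 letters = 8.
strict = re.compile(r"[A-Z][VIFY][MLFY][A-Z]{5}")

loose_hits = [s for s in mylist if loose.match(s)]
# fullmatch requires the whole string to match, so the length is enforced.
strict_hits = [s for s in mylist if strict.fullmatch(s)]

print(loose_hits)   # ['BILL', 'BILLABCD']
print(strict_hits)  # ['BILLABCD']
```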
Upvotes: 1