Reputation: 5636
I want to find all possible substrings inside a string with the following requirement: The substring starts with N, the next letter is anything but P, and the next letter is S or T
With the test string "NNSTL", I would like to get as results "NNS" and "NST".
Is this possible with Regex?
Upvotes: 5
Views: 2382
Reputation: 395843
You can do this with the re module:
import re
Here's a possible search string:
my_txt = 'NfT foo NxS bar baz NPT'
The regular expression matches an N, then any single character other than P, then a character that is either S or T.
regex = 'N[^P][ST]'
and using re.findall:
found = re.findall(regex, my_txt)
and found returns:
['NfT', 'NxS']
Upvotes: 2
Reputation: 4356
Try the following regex:
N[^P\W\d_][ST]
The first character is N. The next character is anything except (^) P, a non-word character (\W), a digit (\d) or an underscore (_). The last character is either S or T. I'm assuming the second character must be a letter.
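To see how the letter-only assumption changes the result, here is a small sketch comparing this pattern with the simpler N[^P][ST]. The sample string is made up for illustration and is not from the question.

```python
import re

# Hypothetical sample: the middle character is a hyphen, a digit,
# a letter, an underscore, and finally the forbidden letter P.
sample = 'N-S N1T NaS N_T NPS'

# Any middle character except P is accepted.
loose = re.findall(r'N[^P][ST]', sample)

# The middle character must be a letter other than P.
strict = re.findall(r'N[^P\W\d_][ST]', sample)

print(loose)   # ['N-S', 'N1T', 'NaS', 'N_T']
print(strict)  # ['NaS']
```

Which of the two is right depends on whether the input can contain non-letters at all.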
EDIT
The above regex will only match the first instance in the string "NNSTL" because it will then start the next potential match at position 3: "TL". If you truly want both results at the same time use the following:
(?=(N[^P\W\d_][ST])).
The substring will be in group 1 rather than in the whole pattern match, which will only be the first character.
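A minimal sketch of the lookahead approach in Python, run on the test string from the question. The trailing `.` is left out here because Python's re moves past zero-width matches by itself; other regex engines may need it.

```python
import re

text = 'NNSTL'

# Ordinary matching consumes "NNS" and resumes the search at "TL",
# so the overlapping "NST" is never found.
plain = re.findall(r'N[^P\W\d_][ST]', text)

# The lookahead consumes nothing, so every start position is tried
# and the captured group holds each candidate substring.
overlapping = re.findall(r'(?=(N[^P\W\d_][ST]))', text)

# finditer also exposes where each overlapping match starts.
positions = [(m.start(), m.group(1))
             for m in re.finditer(r'(?=(N[^P\W\d_][ST]))', text)]

print(plain)        # ['NNS']
print(overlapping)  # ['NNS', 'NST']
print(positions)    # [(0, 'NNS'), (1, 'NST')]
```

Because the pattern has exactly one capturing group, re.findall returns the group contents rather than the (empty) overall match.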
Upvotes: 4
Reputation: 180
Yes. The regex snippet is: "N[^P][ST]"
Plug it into any of the re module functions documented here: http://docs.python.org/2/library/re.html
Explanation:
N matches a literal "N".
[^P] is a set, where the caret ("^") denotes inverse (so it matches anything not in the set).
[ST] is another set, where it matches either an "S" or a "T".
Upvotes: 1