Reputation: 3360
I have these two strings:
s2 = start bla bla bla word1
bla bla bla word1
value1 word1
bla bla bla
s1= start bla bla bla word1
bla bla bla word1
bla bla bla word1
value1
I want to check whether a string contains value1, but only when it comes after the second word1 (and before any third one).
So s2 should return value1, but s1 should return None, because in s1 value1 comes after the third word1.
I tried this:
re.search(r'start(.*?word1){2}\s+(value1)', s)
The problem is that my search returns value1 for both s1 and s2, because . matches everything, including word1 itself.
Upvotes: 0
Views: 80
Reputation: 174836
Use a negative lookahead assertion, as shown here. This regex captures the string value1 only if exactly two word1 strings occur between start and value1.
r'start(?:(?:(?!word1).)*word1){2}(?:(?!word1).)*?(value1)'
(?:(?!word1).)* matches any character, zero or more times, as long as that character does not begin the string word1. That is, before consuming each character, the regex engine checks that the text at the current position is not w followed by ord1. Only if that check passes does the engine match the character. The check runs before each and every character, so matching stops as soon as the engine reaches word1.
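To see the tempered dot in isolation, here is a minimal sketch comparing it with a plain greedy .* (the sample string and variable names are made up for illustration and are not from the question):

```python
import re

# Hypothetical sample text, chosen only to show the difference.
text = "aaa word1 bbb word1 ccc"

# A plain greedy .* runs as far as it can, so it ends at the LAST word1.
greedy = re.match(r'.*word1', text).group(0)

# The tempered dot refuses to consume a character that starts "word1",
# so it has to stop right before the FIRST occurrence.
tempered = re.match(r'(?:(?!word1).)*word1', text).group(0)

print(greedy)    # aaa word1 bbb word1
print(tempered)  # aaa word1
```

Because each repetition of the group can never run past a word1, the {2} quantifier in the full pattern counts exactly two of them.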
>>> import re
>>> s2 = "start bla bla bla word1 bla bla bla word1 value1 word1 bla bla bla"
>>> s1= "start bla bla bla word1 bla bla bla word1 bla bla bla word1 value1"
>>> re.search(r'start(?:(?:(?!word1).)*word1){2}(?:(?!word1).)*?(value1)', s2)
<_sre.SRE_Match object at 0x7f0bb60e9558>
>>> re.search(r'start(?:(?:(?!word1).)*word1){2}(?:(?!word1).)*?(value1)', s2).group(1)
'value1'
>>> re.search(r'start(?:(?:(?!word1).)*word1){2}(?:(?!word1).)*?(value1)', s1)
>>>
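If the check is needed in more than one place, the pattern could be wrapped in a small helper. This is only a sketch: the name value_after_nth and its parameters are hypothetical, re.escape is used in case the marker contains regex metacharacters, and re.DOTALL is an added assumption so that strings spanning several lines (as laid out in the question) are handled too.

```python
import re

def value_after_nth(text, marker, value, n=2, start="start"):
    """Return value if it occurs after exactly n markers following start
    and before the next marker; otherwise return None."""
    m, v, s = re.escape(marker), re.escape(value), re.escape(start)
    # Tempered dot: any character that does not begin the marker.
    tempered = r'(?:(?!%s).)*' % m
    pattern = r'%s(?:%s%s){%d}%s?(%s)' % (s, tempered, m, n, tempered, v)
    # DOTALL lets "." cross newlines (an assumption, not in the answer above).
    match = re.search(pattern, text, re.DOTALL)
    return match.group(1) if match else None

s2 = "start bla bla bla word1 bla bla bla word1 value1 word1 bla bla bla"
s1 = "start bla bla bla word1 bla bla bla word1 bla bla bla word1 value1"
print(value_after_nth(s2, "word1", "value1"))  # value1
print(value_after_nth(s1, "word1", "value1"))  # None
```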
Upvotes: 2
Reputation: 107347
You can use the following function, which uses re.findall with a positive look-behind in the regex:
>>> def find(val,s):
...     # [1] is the word that follows the second "word1 "
...     if re.findall(r'(?<=word1 )\w+',s)[1]==val:
...         return val
...     else:
...         return None
...
>>> print(find('value1',s1))
None
>>> print(find('value1',s2))
value1
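Note that this approach only looks at the single word directly after each word1, that it expects exactly one space after word1, and that the [1] index raises IndexError when fewer than two matches are found. A slightly more defensive sketch follows; the name find_safe and its marker and n parameters are hypothetical additions, not part of the answer above.

```python
import re

def find_safe(val, s, marker="word1", n=2):
    # For every marker, capture the word that directly follows it.
    # The lookahead keeps that word unconsumed, and \s+ accepts any
    # whitespace (including newlines) between marker and word.
    followers = re.findall(r'%s(?=\s+(\w+))' % re.escape(marker), s)
    # Guard against strings with fewer than n markers.
    if len(followers) >= n and followers[n - 1] == val:
        return val
    return None

s2 = "start bla bla bla word1 bla bla bla word1 value1 word1 bla bla bla"
s1 = "start bla bla bla word1 bla bla bla word1 bla bla bla word1 value1"
print(find_safe('value1', s1))  # None
print(find_safe('value1', s2))  # value1
```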
Upvotes: 0