david

Reputation: 3360

How to match everything except a certain string? (regex)

I have these two strings:

s2 = "start bla bla bla word1 bla bla bla word1 value1 word1 bla bla bla"

s1 = "start bla bla bla word1 bla bla bla word1 bla bla bla word1 value1"

I want to check whether a string s contains value1, but only when value1 comes right after the second word1. So s2 should return value1, but s1 should return None, because there value1 comes after the third word1.

I tried this:

re.search(r'start(.*?word1){2}\s+(value1)', s)

The problem is that my search returns value1 for both s1 and s2, because . matches everything, including word1 itself.

Upvotes: 0

Views: 80

Answers (2)

Avinash Raj

Reputation: 174836

Use a negative lookahead assertion, as below. This regex captures the string value1 only if exactly two word1 strings occur between start and value1.

r'start(?:(?:(?!word1).)*word1){2}(?:(?!word1).)*?(value1)'

(?:(?!word1).)* matches any character, zero or more times, as long as the text word1 does not begin at that character. Before consuming each character, the regex engine checks that the upcoming text is not word1 (that is, not w followed by ord1); only if that check passes does it match the character. Because the check runs before every single character, matching stops as soon as the engine reaches a word1 string.
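To see that piece in isolation, here is a minimal sketch that applies only the tempered token to a short sample string. The names token, text, prefix and greedy are made up for this illustration and do not come from the question.

```python
import re

# Tempered token: consume one character at a time, but only at
# positions where the text "word1" does not begin.
token = r'(?:(?!word1).)*'

text = "bla bla word1 bla"  # sample input, assumed for illustration

# Stops right before word1, keeping the trailing space.
prefix = re.match(token, text).group(0)
print(repr(prefix))  # 'bla bla '

# For comparison, a plain greedy .* runs straight through word1.
greedy = re.match(r'.*', text).group(0)
print(repr(greedy))  # 'bla bla word1 bla'
```

Note that . does not match newlines unless re.DOTALL is passed, so the token as written only works within a single line.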

DEMO

>>> import re
>>> s2 = "start bla bla bla word1 bla bla bla word1 value1 word1 bla bla bla"
>>> s1= "start bla bla bla word1 bla bla bla word1 bla bla bla word1 value1"
>>> re.search(r'start(?:(?:(?!word1).)*word1){2}(?:(?!word1).)*?(value1)', s2)
<_sre.SRE_Match object at 0x7f0bb60e9558>
>>> re.search(r'start(?:(?:(?!word1).)*word1){2}(?:(?!word1).)*?(value1)', s2).group(1)
'value1'
>>> re.search(r'start(?:(?:(?!word1).)*word1){2}(?:(?!word1).)*?(value1)', s1)
>>> 
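If the marker word, the count, or the value vary, the same pattern can be built from parameters. This is only a sketch: the function name value_after_nth and its arguments are hypothetical, and it assumes the inputs are literal strings (hence re.escape).

```python
import re

def value_after_nth(text, value, marker="word1", n=2, start="start"):
    """Return value if it appears after exactly n markers following start, else None."""
    m = re.escape(marker)
    # One character that is not the beginning of the marker.
    not_marker = '(?:(?!' + m + ').)'
    pattern = (
        re.escape(start)
        + '(?:' + not_marker + '*' + m + '){' + str(n) + '}'  # exactly n markers
        + not_marker + '*?'                                   # no further marker
        + '(' + re.escape(value) + ')'
    )
    match = re.search(pattern, text)
    return match.group(1) if match else None

s2 = "start bla bla bla word1 bla bla bla word1 value1 word1 bla bla bla"
s1 = "start bla bla bla word1 bla bla bla word1 bla bla bla word1 value1"

print(value_after_nth(s2, "value1"))  # value1
print(value_after_nth(s1, "value1"))  # None
```

With the defaults this builds the same regex as above, so value1 may sit anywhere between the second and third word1, not only directly after the second one.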

Upvotes: 2

Kasravnd

Reputation: 107347

You can use the following function, which uses re.findall with a positive lookbehind in the regex:

>>> def find(val, s):
...     # [1] is the word that directly follows the second "word1 "
...     if re.findall(r'(?<=word1 )\w+', s)[1] == val:
...         return val
...     else:
...         return None
... 
>>> print(find('value1', s1))
None
>>> print(find('value1', s2))
value1
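One caveat with indexing the findall result directly: if the string has fewer than two "word1 " occurrences followed by a word, [1] raises IndexError. A guarded variant might look like the sketch below; the name find_second and the marker parameter are hypothetical additions, not part of the answer above.

```python
import re

def find_second(val, s, marker="word1"):
    # Collect the word that directly follows each "<marker> " occurrence.
    # The lookbehind is fixed-width because the marker is a literal string.
    followers = re.findall(r'(?<=' + re.escape(marker) + r' )\w+', s)
    # Check the length first so short inputs return None instead of raising.
    if len(followers) >= 2 and followers[1] == val:
        return val
    return None

s2 = "start bla bla bla word1 bla bla bla word1 value1 word1 bla bla bla"
s1 = "start bla bla bla word1 bla bla bla word1 bla bla bla word1 value1"

print(find_second('value1', s1))  # None
print(find_second('value1', s2))  # value1
```

Unlike the lookahead pattern, this approach only accepts value1 when it is the very next word after the second word1, separated by a single space.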

Upvotes: 0
