Reputation: 3455
I'm starting to learn the re module. First, here is my original code:
import re
cheesetext = u'''<tag>I love cheese.</tag>
<tag>Yeah, cheese is all I need.</tag>
<tag>But let me explain one thing.</tag>
<tag>Cheese is REALLY I need.</tag>
<tag>And the last thing I'd like to say...</tag>
<tag>Everyone can like cheese.</tag>
<tag>It's a question of the time, I think.</tag>'''
def action1(source):
    regex = u'<tag>(.*?)</tag>'
    pattern = re.compile(regex, re.UNICODE | re.DOTALL | re.IGNORECASE)
    result = pattern.findall(source)
    return result
def action2(match, source):
    pattern = re.compile(match, re.UNICODE | re.DOTALL | re.IGNORECASE)
    result = bool(pattern.findall(source))
    return result
result = action1(cheesetext)
result = [item for item in result if action2(u'cheese', item)]
print result
>>> [u'I love cheese.', u'Yeah, cheese is all I need.', u'Cheese is REALLY I need.', u'Everyone can like cheese.']
Now for what I actually need: I want to do the same thing with a single regex. This is only an example; I have to process much more data than these cheesy texts. :-) Is it possible to combine these two actions into one regex? In other words, how can I use conditions in a regex?
Upvotes: 0
Views: 1952
Reputation: 20270
>>> p = u'<tag>((?:(?!</tag>).)*cheese.*?)</tag>'
>>> patt = re.compile(p, re.UNICODE | re.DOTALL | re.IGNORECASE)
>>> patt.findall(cheesetext)
[u'I love cheese.', u'Yeah, cheese is all I need.', u'Cheese is REALLY I need.', u'Everyone can like cheese.']
This uses a negative-lookahead assertion. A good explanation of this is given by Tim Pietzcker in this question.
Upvotes: 2
Reputation: 694
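I suggest using a negative lookahead to make sure the match never contains a </tag>. Note that re.IGNORECASE is needed to match "Cheese" as well:
re.findall(r'<tag>((?:(?!</tag>).)*?cheese(?:(?!</tag>).)*?)</tag>', cheesetext, re.IGNORECASE | re.DOTALL)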
Upvotes: 1
Reputation: 10717
You can use | (alternation):
>>> import re
>>> m = re.compile("(Hello|Goodbye) World")
>>> m.match("Hello World")
<_sre.SRE_Match object at 0x01ECF960>
>>> m.match("Goodbye World")
<_sre.SRE_Match object at 0x01ECF9E0>
>>> m.match("foobar")
>>> m.match("Hello World").groups()
('Hello',)
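In addition, if you need actual conditions, there are lookahead assertions, (?=...) and (?!...), backreferences to named groups, (?P=name), and the conditional construct (?(id/name)yes-pattern|no-pattern), which branches on whether an earlier group took part in the match. See Python's re module docs.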
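For a true "if/else" inside a pattern, the re module has the conditional construct `(?(id/name)yes-pattern|no-pattern)`. Below is a small sketch, not taken from the question: the pattern and the sample strings are made up for illustration. It requires a closing `</tag>` only if an opening `<tag>` was matched.

```python
import re

# Group 1 optionally matches an opening <tag>.
# (?(1)</tag>|$) means: if group 1 matched, require </tag>;
# otherwise require the end of the string.
cond = re.compile(r"(<tag>)?(\w+)(?(1)</tag>|$)")

samples = ["<tag>cheese</tag>", "cheese", "<tag>cheese", "cheese</tag>"]

# Map each sample to True/False depending on whether the pattern matches.
results = {s: bool(cond.match(s)) for s in samples}

for s in samples:
    print(s, results[s])
```

Only the first two samples match: a wrapped word with both tags, or a bare word with neither. An unbalanced opening or closing tag is rejected.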
Upvotes: 1