Peter

Reputation: 1128

Match all regex conditions in any order

I have a webpage that I want to scrape using regex. The page may contain up to 3 text blocks that I care about.

The regex should match only if all three text blocks are present; otherwise it should not match. The blocks can appear in any order on the page.

I tried this, but it doesn't satisfy the "any order" requirement:

import re
re_text = r'(Text block 1)((.|\n)*)(Text block 2)((.|\n)*)(Text block 3)'
re_compiled = re.compile(re_text)

Should I use backreferences here? Or is there another solution?

Upvotes: 0

Views: 172

Answers (2)

pogo

Reputation: 1550

>>> all(s in 'xyz' for s in ('a', 'b', 'c'))
False
>>> all(s in 'ayz' for s in ('a', 'b', 'c'))
False
>>> all(s in 'abc' for s in ('a', 'b', 'c'))
True

Upvotes: -1

nneonneo

Reputation: 179392

How about just looking for them individually?

import re

re_texts = [re.compile('textblock1'), re.compile('textblock2'), re.compile('textblock3')]

# `text` is the page content to search
if all(r.search(text) for r in re_texts):
    print('all matches found')
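
If a single compiled pattern is preferred, one option is to chain lookahead assertions, one per block. The following is only a sketch: the helper name `build_all_of_pattern` and the sample page strings are made up for illustration, and it assumes the three blocks are literal text (hence `re.escape`) rather than regexes of their own.

```python
import re

def build_all_of_pattern(blocks):
    """Compile one regex that matches only if every block occurs in the text."""
    # Each (?=.*block) lookahead scans ahead from the current position without
    # consuming anything, so the blocks may appear in any order.
    lookaheads = "".join(f"(?=.*{re.escape(b)})" for b in blocks)
    # re.DOTALL lets "." match newlines, replacing the (.|\n)* workaround.
    return re.compile(lookaheads, re.DOTALL)

pattern = build_all_of_pattern(["Text block 1", "Text block 2", "Text block 3"])

# Hypothetical page contents, for illustration only.
page_any_order = "Text block 3\n...\nText block 1\n...\nText block 2"
page_missing_one = "Text block 1\n...\nText block 3"

# match() anchors at the start of the string, so each lookahead
# gets to scan the whole page.
found_all = pattern.match(page_any_order) is not None
found_partial = pattern.match(page_missing_one) is not None
```

Here `found_all` is true and `found_partial` is false. Searching for each block separately with `all(...)` is usually easier to read and debug; the lookahead form is mainly useful when an API accepts only one pattern.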

Upvotes: 3
