Reputation: 13
I read some posts here but they couldn't help me figure out my issue:
The regex below is meant to match this markup everywhere except where it is preceded by the string "Profile Pictures". I want to match all other cases, i.e. whenever the text right before the expression is not "Profile Pictures", but it doesn't work:
re.compile(r"(?!Profile Pictures)</strong></a><div class=\"photoTextSubtitle fsm fwn fcg\">(\d+) photos</div>")
The matched numbers (\d+) are returned, but the "Profile Pictures" entry is still counted as one of them. I tried several different approaches, but none of them worked. However, I still feel a negative lookahead is the way to solve it. Any ideas? Thank you!
Upvotes: 1
Views: 156
Reputation: 18633
You're using (?!...), a negative lookahead assertion, which according to the Python regex documentation:
Matches if ... doesn’t match next. This is a negative lookahead assertion. For example, Isaac (?!Asimov) will match 'Isaac ' only if it’s not followed by 'Asimov'.
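To see why the lookahead has no effect in the question's pattern, here is a small sketch. The first half reproduces the documentation's Isaac example; the second half uses a made-up sample string (an assumption, not the asker's real HTML) to show that a lookahead placed at the very start of the pattern inspects the same position where `</strong>` has to begin, so it can never see "Profile Pictures" there.

```python
import re

# The documentation's example: 'Isaac ' matches only when 'Asimov' does not follow.
isaac = re.compile(r"Isaac (?!Asimov)")
followed_by_newton = isaac.search("Isaac Newton")
followed_by_asimov = isaac.search("Isaac Asimov")
print(followed_by_newton is not None)
print(followed_by_asimov is not None)

# The lookahead below is checked at the position where '</strong>' starts.
# The text at that position begins with '<', never with 'Profile Pictures',
# so the assertion always succeeds and filters nothing out.
lookahead = re.compile(r"(?!Profile Pictures)</strong>")
sample = "Profile Pictures</strong>"  # hypothetical sample text
still_matches = lookahead.search(sample)
print(still_matches is not None)
```

The last check shows the unwanted "Profile Pictures" entry still matching, which is consistent with the behaviour described in the question.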
In this case what you want is (?<!...), which is a negative lookbehind assertion. This is because the text you're trying to exclude comes before the text you want to match, not after it. From the regex docs:
Matches if the current position in the string is not preceded by a match for .... This is called a negative lookbehind assertion. Similar to positive lookbehind assertions, the contained pattern must only match strings of some fixed length. Patterns which start with negative lookbehind assertions may match at the beginning of the string being searched.
That'd give you a regex that looked like this instead:
re.compile(r"(?<!Profile Pictures)</strong></a><div class=\"photoTextSubtitle fsm fwn fcg\">(\d+) photos</div>")
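As a rough check, the lookbehind version can be run against some invented markup. The HTML below is purely hypothetical (the album name "Holiday" and the surrounding `<a><strong>` tags are assumptions, since the real page source wasn't posted), and the pattern is written with a single-quoted raw string so the double quotes need no escaping; it should otherwise be equivalent to the regex above.

```python
import re

pattern = re.compile(
    r'(?<!Profile Pictures)</strong></a>'
    r'<div class="photoTextSubtitle fsm fwn fcg">(\d+) photos</div>'
)

# Hypothetical markup: one "Profile Pictures" album and one other album.
html = (
    '<a><strong>Profile Pictures</strong></a>'
    '<div class="photoTextSubtitle fsm fwn fcg">12 photos</div>'
    '<a><strong>Holiday</strong></a>'
    '<div class="photoTextSubtitle fsm fwn fcg">34 photos</div>'
)

# findall returns the captured (\d+) group for each match.
counts = pattern.findall(html)
print(counts)
```

With this sample, only the count from the album that is not preceded by "Profile Pictures" should come back. The lookbehind works here because "Profile Pictures" has a fixed length, which Python's `re` module requires for lookbehind patterns.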
Of course, it's difficult to test this without some examples from you.
Upvotes: 1