subhacom

Reputation: 919

Python re: negating part of a regular expression

Perhaps a silly question, but although Google returned lots of similar cases, I could not find this exact situation: what regular expression will match all strings NOT containing a particular substring? For example, I want to match any string that does not contain 'foo_'. Now,

 re.match('(?<!foo_).*', 'foo_bar') 

returns a match. While

re.match('(?<!foo_)bar', 'foo_bar')

does not. I tried the non-greedy version:

 re.match('(?<!foo_).*?', 'foo_bar')

still returns a match. If I add more characters after the ),

re.search('(?<!foo_)b.*', 'foo_bar')

it returns None, but if the target string has more trailing chars:

re.search('(?<!foo_)b.*', 'foo_barbaric')

it returns a match. I intentionally kept out the initial .* or .*? in the re. But same thing happens with that.

Any ideas why this strange behaviour? (I need this as a single regular expression - to be entered as a user input).

Upvotes: 1

Views: 8401

Answers (3)

Ahmad Mageed

Reputation: 96477

Try this pattern instead:

^(?!.*foo_).*

This uses the ^ metacharacter to anchor the match at the beginning of the string, and then uses a negative look-ahead that checks whether "foo_" occurs anywhere after that point. If it does, the match fails.

Since you gave examples using both re.match() and re.search(), note that the above pattern works with both. With re.match() you can safely omit the ^ metacharacter, since re.match() only ever tries to match at the beginning of the string. With re.search() the ^ is required, because re.search() tries every position in the string.
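A quick way to check this is the sketch below. The sample strings are made up for illustration, and the last lines show what happens if the ^ is dropped while using re.search().

```python
import re

pattern = r"^(?!.*foo_).*"

# Strings containing 'foo_' anywhere are rejected by both functions.
for s in ["foo_bar", "a_foo_b"]:
    assert re.match(pattern, s) is None
    assert re.search(pattern, s) is None

# Strings without 'foo_' are matched in full.
for s in ["bar", "food", ""]:
    assert re.match(pattern, s).group() == s
    assert re.search(pattern, s).group() == s

# Without the anchor, re.search() retries at position 1, where the
# remaining text 'oo_bar' no longer contains 'foo_', so it succeeds.
unanchored = re.search(r"(?!.*foo_).*", "foo_bar")
print(unanchored.group())  # oo_bar
```

One caveat: by default . does not match a newline, so for multi-line input you would probably want to pass re.DOTALL as well.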

Upvotes: 3

Tim Pietzcker

Reputation: 336128

You're using lookbehind assertions where you need lookahead assertions:

re.match(r"(?!.*foo_).*", "foo_bar")

would work (i.e. not match).

(?!.*foo_) means "Assert that it is impossible to match .*foo_ from the current position in the string." Since you're using re.match(), that position is automatically the start of the string.
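To make the difference concrete, here is a small side-by-side sketch using the string from the question. The variable names are just for illustration.

```python
import re

text = "foo_bar"

# Lookbehind at position 0: nothing precedes the start of the string,
# so "not preceded by foo_" holds and .* consumes everything.
behind = re.match(r"(?<!foo_).*", text)
print(behind.group())  # foo_bar

# Lookahead at position 0: .*foo_ can be matched from here, so the
# negative assertion fails and re.match() returns None.
ahead = re.match(r"(?!.*foo_).*", text)
print(ahead)  # None
```

The lookbehind only inspects what comes before the current position, which is why it never sees the 'foo_' that sits at the start of the text being matched.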

Upvotes: 4

Ryan McDermott

Reputation: 571

I feel like there is a good chance that you could just design around this with a conditional statement.

(It would be nice if we knew specifically what you're trying to accomplish).

Why not:

import re

# re.search() scans the whole string; re.match() would only test the start.
if not re.search("foo_", something):
    do_something()
else:
    print("Skipping this")

Upvotes: 0
