devoured elysium

Reputation: 105197

Why is this negative lookbehind being considered a successful regex match?

I would expect that this simple text string would not be considered a match:

xyz@xyz:~$ echo xy | perl -n -e 'print if /y(?<!x)/'
xy

But strangely enough, it is. I also tried it at https://regex101.com/ and the result is the same. It seems to match the y, which confuses me. If my understanding is correct, the above regex should only match a y that is not preceded by an x.

PS: This is a simplified example. In general, I would like to use negative lookbehinds against full strings, not just single characters such as x.

I'm using the Perl regex flavour.

Thanks

Upvotes: 4

Views: 140

Answers (2)

dawg

Reputation: 104072

You have the position of the negative lookbehind assertion backwards.

A lookbehind is zero-width and inspects the text immediately before its own position, so it needs to come before the y, not after it as you have written it.

Given:

$ echo $'xy\nay\nyx' 
xy
ay
yx

The pattern /y(?<!x)/ matches every line that contains a y, whether an x comes before or after it, because the character immediately behind the assertion is always the y itself, never an x:

$ echo $'xy\nay\nyx' | perl -n -e 'print if /y(?<!x)/'
xy
ay
yx

Note that yx also matches: the assertion is evaluated right after the y (before the x) and looks back at the y. So all three lines match.

Compare that with what you are looking for:

$ echo $'xy\nay\nyx' | perl -n -e 'print if /(?<!x)y/'
ay
yx

Alternatively, if you keep the lookbehind after the y, account for the width of the y (or whatever the match is) by including it in the assertion:

$ echo $'xy\nay\nyx' | perl -n -e 'print if /y(?<!xy)/'
ay
yx

Upvotes: 5

ikegami

Reputation: 386501

ab means a followed by b. As such, you are checking for a y at one position, and then you check that the next position isn't preceded by x. It isn't preceded by x —it's preceded by y— so the pattern matches.

You want

/(?<!x)y/

or

/(?:^|[^x])y/

Upvotes: 5
