Krab

Reputation: 6756

Perl - why does this regexp match?

perl -ne 'print if /^(?=.*?\bPavel\b)(?=.*?\bDavid\b)((?!Petr|Jan)).*$/'

input: Honza,David,Pavel,Marie,Adam

I think it shouldn't pass, but it does.

The first lookahead should 'consume' Honza,David,Pavel, and then the second lookahead should fail because there is no David after Pavel, right?

Upvotes: 1

Views: 56

Answers (2)

ikegami

Reputation: 385764

"The first lookahead should 'consume' Honza,David,Pavel."

Not at all. It's called a zero-width positive lookahead because it consumes nothing. It does not advance the position at which the next atom must match, so the second lookahead must match at position zero too.
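A quick way to see this is to look at the match offsets. The following is only a minimal sketch; match_span is a hypothetical helper name, not something from the question. It returns the start and end offsets of a successful match, taken from Perl's @- and @+ arrays.

```perl
use strict;
use warnings;

# Hypothetical helper: return (start, end) offsets of the match,
# or an empty list if the pattern does not match.
sub match_span {
    my ($line, $re) = @_;
    return unless $line =~ $re;
    return ($-[0], $+[0]);
}

my $line = 'Honza,David,Pavel,Marie,Adam';

# A pattern made only of an anchor and a lookahead matches zero characters.
my ($start, $end) = match_span($line, qr/^(?=.*?\bPavel\b)/);
printf "start=%d end=%d\n", $start, $end;    # start=0 end=0

# The second lookahead is therefore also tried at offset 0, where it can
# still see David even though David comes before Pavel.
print "both found\n" if $line =~ /^(?=.*?\bPavel\b)(?=.*?\bDavid\b)/;
```

Both offsets are 0, so nothing was consumed by the lookahead.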


(?!Petr|Jan) is not going to work as is, though. It only checks that the string doesn't start with Petr or Jan. You could use

/^(?=.*\bPavel\b)(?=.*\bDavid\b)(?!.*\b(?:Petr|Jan)\b)/x

which is basically a melding of

/\bPavel\b/ && /\bDavid\b/ && !/\b(?:Petr|Jan)\b/

This approach only works because each condition applies to the whole string, from the start to the end.
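To compare the two patterns side by side, here is a small sketch. The sample lines other than the first are made up for illustration, and matches_original / matches_fixed are hypothetical helper names.

```perl
use strict;
use warnings;

# Pattern from the question and the corrected pattern from this answer.
my $original = qr/^(?=.*?\bPavel\b)(?=.*?\bDavid\b)((?!Petr|Jan)).*$/;
my $fixed    = qr/^(?=.*\bPavel\b)(?=.*\bDavid\b)(?!.*\b(?:Petr|Jan)\b)/;

sub matches_original { my ($line) = @_; return $line =~ $original ? 1 : 0 }
sub matches_fixed    { my ($line) = @_; return $line =~ $fixed    ? 1 : 0 }

for my $line (
    'Honza,David,Pavel,Marie,Adam',   # original: 1, fixed: 1
    'Honza,David,Pavel,Petr',         # original: 1, fixed: 0 (Petr is not at the start)
    'Jan,David,Pavel',                # original: 0, fixed: 0
    'Honza,David,Marie',              # original: 0, fixed: 0 (no Pavel)
) {
    printf "%-30s original=%d fixed=%d\n",
        $line, matches_original($line), matches_fixed($line);
}
```

The second line is the interesting one: the original pattern lets it through because Petr appears later in the string, not at position zero.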

Upvotes: 4

Miller

Reputation: 35198

Independent lookaheads don't "consume" anything. Your regex is equivalent to:

print if /^(?=.*?\bPavel\b)/ && /^(?=.*?\bDavid\b)/ && /^((?!Petr|Jan))/

which can be simplified to just:

perl -ne 'print if /\bPavel\b/ && /\bDavid\b/ && /^((?!Petr|Jan))/'
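As a rough check of that equivalence, the sketch below runs the single regex and the simplified chain over a few single-line samples and compares the results. The helper names combined and split_up are hypothetical, and all sample lines except the first are made up. It only shows agreement on these inputs; it is not a proof.

```perl
use strict;
use warnings;

# The single regex from the question.
sub combined {
    my ($line) = @_;
    return $line =~ /^(?=.*?\bPavel\b)(?=.*?\bDavid\b)((?!Petr|Jan)).*$/ ? 1 : 0;
}

# The simplified chain of three independent tests.
sub split_up {
    my ($line) = @_;
    return ($line =~ /\bPavel\b/ && $line =~ /\bDavid\b/
            && $line =~ /^((?!Petr|Jan))/) ? 1 : 0;
}

for my $line (
    'Honza,David,Pavel,Marie,Adam',
    'Petr,David,Pavel',
    'Honza,David,Pavel,Petr',
    'Honza,David',
) {
    printf "%-30s combined=%d split=%d\n",
        $line, combined($line), split_up($line);
}
```

Note that both forms still accept Honza,David,Pavel,Petr, because the negative lookahead is only tested at the start of the string.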

Upvotes: 3
