Reputation: 3483
My login.txt file contains the following entries:
abc def
abc 123
def abc
abc de
tha ewe
When I do a positive lookahead using perl, I get the following result:
> cat login.txt | perl -ne 'print if /(?)abc\s(?=def)/'
abc def
...when I use grep, I get the following result:
> cat login.txt | grep -P '(?<=abc)\s(?=def)'
abc def
A negative lookahead gives the following result from perl:
> cat login.txt | perl -ne 'print if /(?)abc\s(?!def)/'
abc 123
def abc
abc de
...and the grep result:
> cat login.txt | grep -P '(?<=abc)\s(?!def)'
abc 123
abc de
perl matched def abc for the negative lookahead, but it shouldn't have, since I'm checking for abc followed by def; whereas grep returns the correct result.
Is something missing in my perl pattern?
Upvotes: 18
Views: 36739
Reputation: 174
perl -ne 'print if /(?)abc\s(?!def)/'
To begin, as fugi stated, the (?) is an empty group that matches only the empty string, so it has no effect.
Therefore, as written, this regex matches the literal string abc followed by a single whitespace character (space, tab, or newline) that is not followed by the literal string def.
Because \s matches a newline character and you did not chomp the trailing newline as you processed each line, def abc matches: abc\s in the regex matches abc plus the newline, which is followed by the end of the string rather than by def.
The corrected regex (accounting for the redundant (?)) would be:
perl -ne 'print if /(?<=abc)\s(?!def)/'
...which matches a single whitespace character (space, tab, or newline) that is preceded by abc and not followed by def.
This will still match def abc because, once again, \s matches the newline, which is preceded by abc and followed by the end of the string rather than by def.
Either chomp the newline before evaluating the regex in Perl, or use the character class [ \t] (if you need to account for tab characters) instead of \s:
perl -ne 'print if /(?<=abc)[ \t](?!def)/'
Or simply:
perl -ne 'print if /(?<=abc) (?!def)/'
Upvotes: 0
Reputation: 98398
grep does not include the newline in the string it checks against the regex, so abc\s does not match when abc is at the end of the line. Either chomp in perl or use the -l command-line option, and you will see similar results.
I'm not sure why you were making other changes between the perl and grep regexes; what was the (?) supposed to accomplish?
Upvotes: 9
Reputation: 945
In your perl -ne 'print if /(?)abc\s(?!def)/' you are asking perl to find abc, then a whitespace character, then anything other than def. This successfully matches def abc, because there is no def after abc here and \s matches the newline.
Upvotes: 2
Reputation: 6578
I would try anchoring your regex like so:
/(^abc\s+(?!def).+)/
This would capture:
abc 123
abc de
The (?) at the beginning of your negative lookahead regex is redundant.
Upvotes: 4