user1539348

Reputation: 513

perl negative look behind with groupings

I have a problem getting a certain match to work with a negative lookbehind.

Example:

@list = qw( apple banana cherry); 
$comb_tlist = join ("|", @list);
$string1 = "include $(dir)/apple";
$string2 = "#include $(dir)/apple";

if( $string1 =~ /^(?<!#).*($comb_tlist)/)   #matching regex I tried, works kinda

The array holds a set of variables that is matched against the string.

I need the regex to match $string1 but not $string2. It matches $string1, but it ALSO matches $string2. Can anyone tell me what I am doing wrong here? Thanks!

Upvotes: 1

Views: 224

Answers (4)

perreal

Reputation: 98078

You don't need negative lookbehind, just match a first character that is not #:

use strict;
use warnings;

my @list = qw( apple banana cherry); 
my $comb_tlist = join ("|", @list);
my $string1 = "include dir/apple";
my $string2 = "#include dir/apple";

for ($string1, $string2) {
  print "match:$_\n" if( /^[^#].*($comb_tlist)/);
}

Also, if you mean to match a literal $(dir), you need to escape the $ sign with a backslash. Otherwise Perl interpolates the special variable $( (the real group ID of the process) into the double-quoted string. If that is the case, "$(dir)" should be written "\$(dir)" in Perl code.
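To make the escaping point concrete, here is a minimal sketch comparing the three ways of writing the string. The variable names ($interpolated, $escaped, $single) are made up for this illustration, and the exact value that $( expands to depends on the system it runs on.

```perl
use strict;
use warnings;

# Unescaped: "$(" is Perl's special variable for the real group ID,
# so it is interpolated and the literal text "$(dir)" is lost.
my $interpolated = "include $(dir)/apple";

# Escaped: the backslash keeps the dollar sign literal.
my $escaped = "include \$(dir)/apple";

# Single quotes never interpolate, so they yield the same literal text.
my $single = 'include $(dir)/apple';

print "interpolated: $interpolated\n";
print "escaped:      $escaped\n";
print "escaped and single-quoted strings are identical\n" if $escaped eq $single;
```

The first line of output shows whatever group ID(s) the process has in place of "$(", which is why the literal path never reaches the regex.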

Upvotes: 2

ikegami

Reputation: 386386

Some problems:

  • Always use use strict; use warnings;.
  • Fix the use of string1 where you meant $string1.
  • Fix the scoping errors detected by the above by using my where appropriate.
  • Fix the typo in the variable names (@list vs @tlist).
  • I'm sure you didn't mean to interpolate the $( variable.
  • You'll never find a # before the first character of the string, so /^(?<!#).* .../ makes no sense. It simply means /^.* .../. You probably wanted /^[^#].* .../
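The last point can be checked with a short sketch. It runs both patterns against both strings and records the results; the names $alt and %result are hypothetical, chosen only for this example.

```perl
use strict;
use warnings;

my @list = qw(apple banana cherry);
my $alt  = join '|', @list;

# Single quotes keep "$(dir)" literal.
my @lines = ('include $(dir)/apple', '#include $(dir)/apple');

my %result;
for my $line (@lines) {
    # At position 0 there is no preceding character, so (?<!#) always succeeds.
    my $lookbehind = $line =~ /^(?<!#).*($alt)/ ? 1 : 0;
    # A character class has to consume a real first character that is not "#".
    my $char_class = $line =~ /^[^#].*($alt)/ ? 1 : 0;

    $result{$line} = [$lookbehind, $char_class];
    print "$line\n  lookbehind: $lookbehind, char class: $char_class\n";
}
```

The lookbehind version reports 1 for both lines, while the character class version reports 1 only for the line that does not start with #.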

Upvotes: 2

TLP

Reputation: 67900

The problem is that the negative lookbehind and the beginning-of-line anchor ^ are both zero-width assertions. So when you say

"start at the beginning of the string"

and then say

"check that the character before it is not #"

...you actually check the character before the start of the string. Which is of course not #, because it is nothing.

Use a lookahead instead. This works:

use strict;
use warnings;
use feature 'say';   # say() is not available without this

my @list = qw( apple banana cherry); 
my $comb_tlist = join ("|", @list);
my $string1 = 'include $(dir)/apple';
my $string2 = '#include $(dir)/apple';

if( $string1 =~ /^(?!#).*($comb_tlist)/)  { say "String1"; }
if( $string2 =~ /^(?!#).*($comb_tlist)/)  { say "String2"; }
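One difference between this lookahead and a [^#] character class may matter in practice: the lookahead consumes nothing, so it still matches when the list word is the very first thing on the line. The sketch below uses a made-up input line ('apple pie') and hypothetical variable names to show this.

```perl
use strict;
use warnings;

my $alt = join '|', qw(apple banana cherry);

# Hypothetical input where the list word starts the line.
my $line = 'apple pie';

# [^#] consumes the "a", leaving only "pple pie" for the alternation.
my $class_matches = $line =~ /^[^#].*($alt)/ ? 1 : 0;

# (?!#) only peeks at the first character and consumes nothing.
my $lookahead_matches = $line =~ /^(?!#).*($alt)/ ? 1 : 0;

print "char class: $class_matches, lookahead: $lookahead_matches\n";
# prints "char class: 0, lookahead: 1"
```

For the strings in the question this makes no difference, since the list word always comes after "include".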

Note that you have made four critical mistakes in your sample code. First off, you use string1 which is a bareword, which will be interpreted as a string. Second, you declare @list but then use @tlist. Third, you don't (seem to) use

use strict;
use warnings;

These pragmas could have informed you of your error, and without them, it is fairly likely that you would not have been warned about your first two critical errors. There is no good reason not to use them, so do that in the future.

Fourth, the declaration

$string1 = "include $(dir)/apple";

means that Perl tries to interpolate the special variable $( (the real group ID) into your string. $ is a metacharacter in double-quoted strings, so you should use single quotes:

my $string1 = 'include $(dir)/apple';

Upvotes: 5

alex

Reputation: 1304

Sometimes complex regexes become trivial if you just split them into two or three. Filter out the commented strings in a first step.
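A minimal sketch of that two-step approach, assuming the input arrives as a list of lines. The third sample line and the names @lines, @matched and $words are invented for the illustration.

```perl
use strict;
use warnings;

my @list  = qw(apple banana cherry);
my $alt   = join '|', @list;
my $words = qr/($alt)/;

# Sample input; single quotes keep "$(dir)" literal.
my @lines = (
    'include $(dir)/apple',
    '#include $(dir)/apple',
    'include $(dir)/pear',
);

my @matched;
for my $line (@lines) {
    next if $line =~ /^#/;                      # step 1: skip commented lines
    push @matched, $line if $line =~ $words;    # step 2: look for a list word
}

print "match: $_\n" for @matched;
```

Each regex now does one job, so neither needs a lookaround.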

Upvotes: 0
