Reputation: 13062
I have a list of substrings that I need to match against a list of URL strings. The substrings contain special characters such as '|', '*', '-', '+', etc. If a URL string contains one of the substrings, I need to perform some operation; for now, let's just say I print "TRUE" to the console.
I did this by first reading the list of substrings into a hash. I then tried a simple regexp match of each URL against the entire list until a match is found. The code is something like this:
open my $ADS, '<', $ad_file or die "can't open $ad_file";
while(<$ADS>) {
    chomp;
    $ads_list_hash{$lines} = $_;
    $lines ++;
}
close $ADS;
open my $IN, '<', $inputfile or die "can't open $inputfile";
my $first_line = <$IN>;
while(<$IN>) {
    chomp;
    my @hhfile = split /,/;
    for my $count (0 .. $lines) {
        if($hhfile[9] =~ /$ads_list_hash{$count}/) {
            print "$hhfile[9]\t$ads_list_hash{$count}\n";
            print "TRUE !\n";
            last;
        }
    }
}
close $IN;
The problem is that the substrings contain a lot of special characters, which causes errors in the match $hhfile[9] =~ /$ads_list_hash{$count}/. A few examples are:
+adverts/
.to/ad.php|
/addyn|*|adtech;
I get an error on lines like these, which basically says "Quantifier follows nothing in regexp". Do I need to change something in the regexp matching syntax to avoid these?
Upvotes: 5
Views: 17505
Reputation: 454960
You need to escape the special characters in the string.
Enclosing the string between \Q and \E will do the job:
if($hhfile[9] =~ /\Q$ads_list_hash{$count}\E/) {
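To see the effect in isolation, here is a minimal sketch using the three sample substrings from the question. The URLs and the helper names first_literal_match and first_index_match are made up for illustration. The second helper assumes you only need a plain substring test, in which case the built-in index avoids the regex engine altogether.

```perl
use strict;
use warnings;

# Sample substrings from the question; the URLs below are invented for the demo.
my @ads = ('+adverts/', '.to/ad.php|', '/addyn|*|adtech;');

# Return the first pattern that occurs literally in $url, or undef if none does.
sub first_literal_match {
    my ($url, @patterns) = @_;
    for my $pat (@patterns) {
        # \Q...\E escapes every regex metacharacter in the interpolated string,
        # so '+', '|', '*' and '.' are matched as ordinary characters.
        return $pat if $url =~ /\Q$pat\E/;
    }
    return;
}

# Same check without a regex: index() returns -1 when the substring is absent.
sub first_index_match {
    my ($url, @patterns) = @_;
    for my $pat (@patterns) {
        return $pat if index($url, $pat) >= 0;
    }
    return;
}

for my $url ('http://example.com/x+adverts/banner.gif',
             'http://example.com/index.html') {
    my $hit = first_literal_match($url, @ads);
    print defined $hit ? "$url\t$hit\nTRUE !\n" : "$url\tno match\n";
}
```

If the same list is matched against many URLs, escaping each substring once with quotemeta when reading the file, instead of on every match, may also be worth considering.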
Upvotes: 14