Reputation: 8969
I am writing Perl code that uses substr to extract characters one at a time, but I have run into a very strange problem.
I am trying to do the following: scan the source character by character. If the character is #, skip to the end of the line. If it is ' or ", find the next matching quote. Each comment or string is wrapped in an HTML color tag to highlight it. Everything else is printed unchanged.
Here is the block of code:
while ($char = (substr $src, $off_set, 1)) {
    if ($char eq '#') {
        $end_index = index $src, "\n", $off_set + 1;
        my $c = substr($src, $off_set, $end_index-$off_set+1);
        print $comment_color.$c.$color_end;
    } elsif (($char eq '"') || ($char eq "'")) {
        $end_index = index ($src, $char, $off_set+1);
        my $char_before = substr $src, $end_index-1, 1;
        while ($end_index > 0 && $char_before eq '\\') {
            $end_index = index $src, $char, $end_index + 1;
            $char_before = substr $src, $end_index-1, 1;
        }
        my $s = substr($src, $off_set, $end_index-$off_set+1);
        print $string_color.$s.$color_end;
    } else {
        print $char;
        $end_index++;
    }
    $off_set = $end_index + 1;
}
When I run it on the following test input, the script exits at the first 0. If I remove every 0, it runs fine. If I remove only the first 0, it exits at the second one. I really have no idea why this happens.
# Comment 1
my $zero = 0;
my @array = (0xdead_beef, 0377, 0b011011);
# xor
sub sample2
{
print "true or false";
return 3 + 4 eq " 7"; # true or false
}
#now write input to STDOUT
print $time . "\n";
my $four = "4";
Upvotes: 1
Views: 94
Reputation: 8969
I finally figured out that the problem is the while loop condition: the loop exits as soon as it reads a 0.
I updated the while loop condition to
while (($char = (substr $src, $off_set, 1)) || ($off_set < (length $src))) {
and it is working now.
Upvotes: 0
Reputation: 57656
This is your loop condition:
while ($char = (substr $src, $off_set, 1)) {
...
So what happens when $char = "0"? Perl considers that a false value, so the loop terminates. Instead, loop as long as characters are left:
while ($off_set < length $src) {
    my $char = substr $src, $off_set, 1;
    ...
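To make the truthiness rule concrete, here is a minimal sketch. It shows which scalars Perl treats as false and then walks a short string using the length-based condition. The sample string and the @false, @true and @seen arrays are made up for illustration and are not part of the asker's script.

```perl
use strict;
use warnings;

# The only false scalars are undef, the empty string, the number 0 and
# the string "0". Strings such as "00" or "0.0" are true.
my @false = grep { !$_ } ('', 0, '0');
my @true  = grep { $_ }  ('00', '0.0', ' ', 'a');

# Walk a string that contains "0" using the length-based condition.
my $src     = 'x = 0;';   # illustrative input, not the asker's file
my $off_set = 0;
my @seen;
while ($off_set < length $src) {
    my $char = substr $src, $off_set, 1;
    push @seen, $char;    # the "0" is collected like any other character
    $off_set++;
}

print scalar(@seen), " characters read\n";
```

Because the condition tests the position rather than the character, the value of $char never decides whether the loop continues.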
Anyway, your code is convoluted and hard to read. Consider using regular expressions instead:
use re '/xsm';
my $src = ...;
pos($src) = 0;
my $out = '';
while (pos($src) < length $src) {
    if ($src =~ m/\G ([#][^\n]*)/gc) {
        $out .= colored(comment => $1);
    }
    elsif ($src =~ m/\G (["] (?:[^"\\]++|[\\].)* ["])/gc) {
        $out .= colored(string => $1);
    }
    elsif ($src =~ m/\G (['] (?:[^'\\]++|[\\]['\\])* ['])/gc) {
        $out .= colored(string => $1);
    }
    elsif ($src =~ m/\G ([^"'#]+)/gc) {
        $out .= $1;
    }
    else {
        die "illegal state";
    }
}
where colored is some helper function.
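The answer leaves colored undefined, so the following is only one possible sketch of such a helper. The tag strings are hypothetical: the question never shows the actual values of $comment_color, $string_color or $color_end, and the names %open_tag and $close_tag are invented here.

```perl
use strict;
use warnings;

# Hypothetical tag table: the question does not show the real values of
# $comment_color, $string_color or $color_end.
my %open_tag = (
    comment => '<span style="color: green">',
    string  => '<span style="color: red">',
);
my $close_tag = '</span>';

# Wrap $text in the tag for $kind; unknown kinds pass through unchanged.
sub colored {
    my ($kind, $text) = @_;
    my $open = $open_tag{$kind};
    return defined $open ? $open . $text . $close_tag : $text;
}

print colored(comment => '# Comment 1'), "\n";
```

This sketch does not HTML-escape $text, so characters such as < or & in the source would need extra handling before the output is embedded in a page.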
Upvotes: 3
Reputation: 35218
Check for defined in your while loop:
while (defined(my $char = substr $src, $off_set, 1)) {
The reason your code was exiting early is that '0' is a false value, so the while loop would end. Instead, this checks whether any value was returned by the substr call.
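One boundary case is worth sketching. According to perldoc -f substr, an offset exactly at the end of the string returns the empty string, which is still defined. Only an offset past the end returns undef, and it does so with a warning under use warnings. The sketch below therefore also stops on the empty string. The input and the @seen array are illustrative and not taken from the question.

```perl
use strict;
use warnings;

my $src     = 'a0b';   # illustrative input containing a "0"
my $off_set = 0;
my @seen;

while (defined(my $char = substr $src, $off_set, 1)) {
    # At $off_set == length $src, substr returns '' rather than undef,
    # so stop here instead of stepping past the end of the string.
    last if $char eq '';
    push @seen, $char;   # "0" is defined, so it no longer ends the loop
    $off_set++;
}

print join(',', @seen), "\n";
```

Without the extra check the loop would still terminate, but only after one pass with an empty $char and a "substr outside of string" warning on the following iteration.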
Upvotes: 4