Reputation: 21
I was hoping to get a little explanation. I have the following script:
open (FILE, '2.txt');
@DNA = <FILE>;
$DNA = join ('', @DNA);
print "DNA = ". $DNA . "\n";
$a=0;
while ($DNA =~ //ig) {$a++;}
print "Total characters = ".$a."\n";
$b=0;
while ($DNA =~ /fl/ig) {$b++;}
print "Total fl = ".$b."\n";
$c=0;
while ($DNA =~ /[^fl]/ig) {$c++;}
print "Total character less fl = ".$c."\n";
exit;
The text document "2.txt" contains the following characters:
flkkkklllkkfewnofnewofewfl
When I run the script I get the following outputs:
DNA = flkkkklllkkfewnofnewofewfl
Total characters = 27
Total fl = 2
Total character less fl = 16
My question is: why, when I do
while ($DNA =~ /fl/ig) {$b++;}
does it count all the instances of fl together,
but when I do
while ($DNA =~ /[^fl]/ig) {$c++;}
it counts the number of characters that
are neither an f nor an l (i.e. the f and the l are treated separately)?
I was looking for the script to count the number of characters that are not part of fl (i.e. with f and l treated together).
Upvotes: 1
Views: 52
Reputation:
[fl] is a character class; it means a single character that is f or l.
It doesn't mean the substring fl.
So [^fl] counts all the characters that are neither f nor l.
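To make the difference concrete, here is a minimal sketch that runs the three patterns against the sample string from the question. The variable names are illustrative and not part of the original script; the `() =` assignment is just a compact way to count matches in list context.

```perl
use strict;
use warnings;

my $dna = 'flkkkklllkkfewnofnewofewfl';   # sample string from the question

# Assigning to the empty list forces list context, so the scalar on the
# left receives the number of matches.
my $pairs   = () = $dna =~ /fl/gi;      # the two-character substring "fl"
my $either  = () = $dna =~ /[fl]/gi;    # any single character that is f or l
my $neither = () = $dna =~ /[^fl]/gi;   # any single character that is neither

print "substring fl   = $pairs\n";     # 2
print "f or l         = $either\n";    # 10
print "neither f or l = $neither\n";   # 16
```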
However, you could count the characters that are not part of an fl pair with a regex like this:
/[^fl]|f(?!l)|(?<!f)l/
Formatted:
[^fl] # Not f nor l
| f (?! l ) # f not followed by l
| (?<! f ) l # l not following f
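As a rough sketch of how this pattern could be dropped into a counting script, assuming the sample string from the question (the names `$not_in_fl` and `$count` are made up for illustration), the `/x` flag lets the commented layout above be used as-is:

```perl
use strict;
use warnings;

my $dna = 'flkkkklllkkfewnofnewofewfl';   # sample string from the question

# /x allows whitespace and comments inside the pattern; /i keeps the
# case-insensitive matching used in the original script.
my $not_in_fl = qr/
      [^fl]        # not f nor l
    | f (?! l )    # f not followed by l
    | (?<! f ) l   # l not following f
/xi;

my $count = () = $dna =~ /$not_in_fl/g;
print "Total characters less fl = $count\n";   # 22 for this sample
```

For this input the result is 22: the 26 characters of the string minus the four that belong to the two fl pairs.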
Upvotes: 2
Reputation: 597
Keeping it simple, maybe consider dropping all the instances of "fl" first, then simply counting the remaining characters:
$DNA =~ s/fl//g;
print "Total characters less fl = ".length($DNA)."\n";
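Note that the substitution above changes $DNA itself, so any later counts would run against the shortened string, and it drops the /i flag used in the question. A small variation, sketched here with illustrative variable names, works on a copy and keeps the matching case-insensitive:

```perl
use strict;
use warnings;

my $dna = 'flkkkklllkkfewnofnewofewfl';   # sample string from the question

# Copy the string first, then delete every "fl" from the copy only.
(my $stripped = $dna) =~ s/fl//gi;

print "Total characters         = ", length($dna), "\n";        # 26
print "Total characters less fl = ", length($stripped), "\n";   # 22
```

If the string is read from a file, a trailing newline would also be counted by length, so a chomp before counting may be needed.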
Upvotes: 0