Reputation: 23
I have a string containing several runs of a consecutive substring, like:
my $substring = "CAG"; my $str = "CAGCAGCAGCAGPGHSMCAGCAG";
I want to calculate the maximum number of consecutive repetitions of the substring in $str.
Upvotes: 0
Views: 149
Reputation: 98398
my $substring = 'CAG';
my $str = 'CAGCAGCAGCAGPGHSMCAGCAG';
# look for a series of consecutive $substring not followed later by a longer such series
my ($longest_substring) = $str =~ /((?:\Q$substring\E)+)(?!.*?\1\Q$substring\E)/s;
my $repetitions = length($longest_substring // '') / length($substring);
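The same result can also be reached without the negative look-ahead, by collecting every run and keeping the longest one. This is only a minimal sketch, assuming the same $str and $substring as above; the names @runs and $max_reps are illustrative and not part of the original answer, and max comes from the core List::Util module:

```perl
use strict;
use warnings;
use List::Util qw(max);

my $substring = 'CAG';
my $str       = 'CAGCAGCAGCAGPGHSMCAGCAG';

# In list context, /g returns the captured text of every run of consecutive $substring
my @runs = $str =~ /((?:\Q$substring\E)+)/g;

# Convert each run's length to a repetition count and keep the largest (0 if nothing matched)
my $max_reps = max(0, map { length($_) / length($substring) } @runs);

print "$max_reps\n";   # prints 4
```

This version trades the single regex for an explicit pass over the runs, which may be easier to read and avoids the backtracking that the look-ahead can cause on long strings.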
Upvotes: 2
Reputation: 424
Try this:
my $number = () = $str =~ /$substring/gi;
print $number;
Upvotes: 0
Reputation: 241888
The matching operator with the /g modifier in list context returns all the matches. To count them, we can impose scalar context on the result:
my @matches = $str =~ /$substring/g;
my $count = scalar @matches;
which returns 6.
It can be further shortened to
my $count = () = $str =~ /$substring/g;
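Here, the () = assignment forces list context on the match, and assigning the result of that list assignment to a scalar variable evaluates it in scalar context, which yields the number of elements on its right-hand side.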
Note that this doesn't report the correct number if the matches are overlapping, e.g.
my $str = 'CACACAC';
my $substring = 'CAC';
The above expression would return 2, because matching with /g starts searching for the next match where the previous match ended. To fix that, use a look-ahead assertion, which doesn't consume the matched text:
my $count = () = $str =~ /(?=$substring)/g;
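To see the difference side by side, here is a small sketch that runs both patterns on the overlapping example. The variable names $non_overlapping and $overlapping are just for illustration, and \Q...\E is an addition that quotes any regex metacharacters in $substring (not needed for 'CAC', but safer in general):

```perl
use strict;
use warnings;

my $str       = 'CACACAC';
my $substring = 'CAC';

# Plain /g resumes after the end of each match, so overlapping occurrences are skipped
my $non_overlapping = () = $str =~ /\Q$substring\E/g;

# The look-ahead matches at each starting position without consuming any characters
my $overlapping = () = $str =~ /(?=\Q$substring\E)/g;

print "$non_overlapping $overlapping\n";   # prints "2 3"
```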
Upvotes: 1