Geo

Reputation: 96987

Is there a Perl shortcut to count the number of matches in a string?

Suppose I have:

my $string = "one.two.three.four";

How should I play with context to get the number of times the pattern found a match (3)? Can this be done using a one-liner?

I tried this:

my ($number) = scalar($string=~/\./gi);

I thought that putting parentheses around $number would force list context, and that scalar would then give me the count. However, all I get is 1.

Upvotes: 95

Views: 71518

Answers (9)

PP.

Reputation: 10865

Try this:

my $string = "one.two.three.four";
my ($number) = scalar( @{[ $string=~/\./gi ]} );

It returns 3 for me. Building an anonymous array with [ ... ] evaluates the regular expression in list context, and @{ ... } dereferences that array reference, so scalar sees an array and returns its length.

Upvotes: 8

Alastair Skeffington

Reputation: 31

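I noticed that if you have an OR condition with capture groups in your regular expression (e.g. /(K..K)|(V.AK)/gi), the resulting list contains one element per capture group for every match. The group that did not take part in a match yields undef, and those undefined elements are included in the count.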

For example:

my $seq = "TSYCSKSNKRCRRKYGDDDDWWRSQYTTYCSCYTGKSGKTKGGDSCDAYYEAYGKSGKTKGGRNNR";
my $regex = '(K..K)|(V.AK)';
my $count = () = $seq =~ /$regex/gi;
print "$count\n";

This gives a count of 6.

I found the solution in the post "How do I remove all undefs from array?":

my $seq = "TSYCSKSNKRCRRKYGDDDDWWRSQYTTYCSCYTGKSGKTKGGDSCDAYYEAYGKSGKTKGGRNNR";
my $regex = '(K..K)|(V.AK)';
my @count = $seq =~ /$regex/gi;
@count = grep defined, @count; 
my $count = scalar @count;
print "$count\n";

This then gives the correct answer of 3.
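
If the captured text itself is not needed, another option is to avoid the capture groups altogether. The following is a minimal sketch, not part of the original answer, that assumes a non-capturing group (?:...) is acceptable for your pattern; with no capture groups, a global match in list context returns one element per match.

```perl
use strict;
use warnings;

my $seq = "TSYCSKSNKRCRRKYGDDDDWWRSQYTTYCSCYTGKSGKTKGGDSCDAYYEAYGKSGKTKGGRNNR";

# (?:...) groups without capturing, so each match contributes
# exactly one element (the whole match) to the list.
my $count = () = $seq =~ /(?:K..K|V.AK)/gi;
print "$count\n";
```

This avoids building the intermediate array and the grep pass.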

Upvotes: 1

HoldOffHunger

Reputation: 20948

Friedo's method is: $a = () = $b =~ $c.

But it's possible to simplify this even further to just ($a) = $b =~ $c by using a substitution, which returns the number of replacements it made:

my ($matchcount) = $text =~ s/$findregex/ /gi;

You could then just wrap this up in a function, getMatchCount(), and not worry about it destroying the passed string.

On the other hand, you can replace each match with itself, which may cost a bit more computation but leaves the string unchanged.

my ($matchcount) = $text =~ s/($findregex)/$1/gi;
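
A rough sketch of what such a getMatchCount() wrapper could look like follows. The function body, the parameter order, and the `|| 0` fallback are assumptions made for illustration, not code from the answer above.

```perl
use strict;
use warnings;

# Hypothetical helper: counts matches of $findregex in $text.
# The arguments are copied into the lexicals below, so the
# substitution changes only the local copy, not the caller's string.
sub getMatchCount {
    my ($text, $findregex) = @_;
    my $matchcount = $text =~ s/$findregex/ /gi;
    return $matchcount || 0;    # s/// returns an empty string when nothing matched
}

my $original = "one.two.three.four";
my $dots     = getMatchCount($original, qr/\./);
print "$dots $original\n";
```

Because the function works on its own copy, the caller's string is still intact after the call.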

Upvotes: -1

Tim Cadell

Reputation: 1

my $count = 0;
my $pos = -1;
while (($pos = index($string, $match, $pos+1)) > -1) {
  $count++;
}

Checked with Benchmark; it's pretty fast.
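
One thing to keep in mind: index() searches for a literal substring, and because the loop restarts one character after each hit, it also counts overlapping occurrences. A global regex match resumes after the end of each hit instead. The small sketch below illustrates the difference; the sample values "aaaa" and "aa" are chosen only for the demonstration.

```perl
use strict;
use warnings;

my $string = "aaaa";
my $match  = "aa";

# index() loop: restarts one character after each hit, so overlaps count.
my $index_count = 0;
my $pos         = -1;
while ( ( $pos = index( $string, $match, $pos + 1 ) ) > -1 ) {
    $index_count++;
}

# Global regex match: resumes after the end of each hit, so overlaps are skipped.
# \Q...\E quotes $match so it is treated as a literal string, like index() does.
my $regex_count = () = $string =~ /\Q$match\E/g;

print "$index_count $regex_count\n";
```

For single-character searches such as the "." in the question, both approaches give the same count.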

Upvotes: -1

friedo

Reputation: 67058

That puts the regex itself in scalar context, which isn't what you want. Instead, put the regex in list context (so it returns all the matches) and then evaluate that list assignment in scalar context (which gives the number of matches).

 my $number = () = $string =~ /\./gi;
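
To make the contrast concrete, here is a minimal sketch using the string from the question. The variable names $as_count and $as_scalar are only illustrative.

```perl
use strict;
use warnings;

my $string = "one.two.three.four";

# The empty list () puts the match in list context; evaluating that
# list assignment in scalar context yields the number of elements matched.
my $as_count = () = $string =~ /\./g;

# scalar() forces the match itself into scalar context, so it only
# reports whether a match was found.
my ($as_scalar) = scalar( $string =~ /\./g );

print "$as_count $as_scalar\n";
```

The parentheses around $as_scalar do not help here, because scalar() has already collapsed the match to a single true value before the assignment happens.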

Upvotes: 142

Mike

Reputation: 1851

Is the following code a one-liner?

print $string =~ s/\./\./g;

Upvotes: 9

ghostdog74

Reputation: 343107

Another way:

my $string = "one.two.three.four";
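my @s = split /\./, $string, -1;   # limit of -1 keeps trailing empty fields, so a trailing "." is still counted
print scalar(@s) - 1;              # N separators produce N+1 fields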

Upvotes: -1

Robert P

Reputation: 15988

I think the clearest way to describe this would be to avoid the instant-cast to scalar. First assign to an array, and then use that array in scalar context. That's basically what the = () = idiom will do, but without the (rarely used) idiom:

my $string = "one.two.three.four";
my @count = $string =~ /\./g;
print scalar @count;

Upvotes: 42

Robert P

Reputation: 15988

Also, see perlfaq4:

There are a number of ways, with varying efficiency. If you want a count of a certain single character (X) within a string, you can use the tr/// function like so:

$string = "ThisXlineXhasXsomeXx'sXinXit";
$count = ($string =~ tr/X//);
print "There are $count X characters in the string";

This is fine if you are just looking for a single character. However, if you are trying to count multiple character substrings within a larger string, tr/// won't work. What you can do is wrap a while() loop around a global pattern match. For example, let's count negative integers:

$string = "-9 55 48 -2 23 -76 4 14 -44";
while ($string =~ /-\d+/g) { $count++ }
print "There are $count negative numbers in the string";

Another version uses a global match in list context, then assigns the result to a scalar, producing a count of the number of matches.

$count = () = $string =~ /-\d+/g;

Upvotes: 27
