user1786376

Reputation: 11

How do I count the times a text string occurs in Perl?

I am trying to count the occurrences of a text string.

My Perl code below prints a statement (a text string) when it finds certain types of files, and I need to count how many times it prints that string.

elsif ($elt =~ /DELETE_.+\.XML/) {
    print "  <-- Delete XMLs !!";
}

I am just trying to learn Perl and I am not a programmer, so please explain any answers.

I don't want to insert, sort or merge, just count.

Upvotes: 1

Views: 586

Answers (3)

amon

Reputation: 57600

If you want to count all files in a directory that have a name that will match /DELETE_.+\.XML/, I would do it like this:

  1. Open the directory.
    In Perl, this is done with

    opendir my $directory, "path/to/dir" or die "Error while opening: $!";
    

    Then, $directory is a variable that represents a handle to this directory.

  2. Take all files in the directory.
    In Perl, we can use the readdir function:

    my @files = readdir $directory;
    

    This reads all the contents of that $directory into an array called @files.

  3. Select all files that match the pattern.
    In Perl, you can select elements that satisfy a certain condition with grep:

    my @interesting_files = grep {/DELETE_.+\.XML/} @files;
    #  ^--output                 ^--a  condition--^ ^--source
    

    We enclose the condition inside curly braces. It can contain arbitrary code, but we'll just put a regular expression in here. grep is a kind of data filter.

  4. We count all the elements in the @interesting_files.
    Perl has a concept of context. There is scalar context and list context. Functions and variables behave differently in each. If an array is used in scalar context, it returns the number of elements in that array. We can force scalar context with the scalar function:

    my $count = scalar @interesting_files;
    

Together, this forms this code:

opendir my $directory, "path/to/dir" or die "Error while opening: $!";
my @files = readdir $directory;
my @interesting_files = grep {/DELETE_.+\.XML/} @files;
my $count = scalar @interesting_files;
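
The context rule from step 4 can be seen in isolation with a small sketch. The file names here are made up for illustration and no directory is read:

```perl
use strict;
use warnings;

# Hypothetical sample data standing in for the result of readdir.
my @files = ('a.XML', 'b.XML', 'c.txt');

my @copy = @files;          # list context: the elements are copied
my $n    = @files;          # scalar context: the number of elements
my $same = scalar @files;   # scalar context, forced explicitly

print "elements: @copy\n";  # prints "elements: a.XML b.XML c.txt"
print "count: $n, $same\n"; # prints "count: 3, 3"
```

Assigning to a scalar variable already imposes scalar context, so the scalar function is mostly useful where the context would otherwise be a list, for example inside the argument list of print.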

That four-line program can be reduced to the following two lines if we omit unnecessary variables and rely on implicit context:

opendir my $directory, "path/to/dir" or die "Error while opening: $!";
my $count = grep {/DELETE_.+\.XML/} readdir $directory;

However, note that $count will only be visible until we leave the enclosing block ({...}). If you need $count outside of this block, you have to declare it with my in the outermost scope where it is used. Alternatively, you can leave out my altogether, which makes $count a global variable, but that has drawbacks and is rejected under use strict.
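
As a rough sketch of that scoping advice, the variable can be declared before the block and only assigned inside it. The @names list is a hypothetical stand-in for the output of readdir:

```perl
use strict;
use warnings;

# Hypothetical file names; in the real script these would come from readdir.
my @names = ('DELETE_a.XML', 'keep.txt', 'DELETE_b.XML');

my $count;    # declared outside the block, so it stays visible afterwards

{
    # Assigning grep to a scalar gives the number of matching elements.
    $count = grep { /DELETE_.+\.XML/ } @names;
}

print "count: $count\n";    # prints "count: 2"
```

Had my $count been written inside the braces instead, the final print would fail to compile under use strict because $count would no longer exist at that point.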


The really elegant solution uses the glob function:

my $count =()= glob "DELETE_*.XML";

This abstracts away the manual directory opening and uses the globbing syntax familiar from Unix shells, matching names in the current working directory. Glob patterns are not regular expressions! Here * matches any run of characters and . is a literal dot. The =()= pseudo-operator can be read as count-of. It imposes list context on the right-hand side, but lets the left-hand side receive the result in scalar context, which is the number of elements.
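
To try the count-of idiom without touching real files, here is a minimal sketch that builds a throwaway directory first. The file names are invented, and it assumes the temporary directory path contains no whitespace, since glob splits its pattern on spaces:

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Scratch directory that is removed again when the program exits.
my $dir = tempdir(CLEANUP => 1);

# Create three empty files (hypothetical names); two match the pattern.
for my $name ('DELETE_a.XML', 'DELETE_b.XML', 'keep.XML') {
    open my $fh, '>', "$dir/$name" or die "Cannot create $name: $!";
    close $fh;
}

# glob runs in list context because of the empty list (); the outer
# scalar assignment then receives the number of items in that list.
my $count = () = glob "$dir/DELETE_*.XML";

print "count: $count\n";    # prints "count: 2"
```

Without the empty parentheses, glob would be called in scalar context and return a matching file name rather than a count.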

Upvotes: 4

Kyle Burton

Reputation: 27528

The following should count the matching lines:

use strict;
use warnings;

my $count = 0;

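# Read the input one line at a time (files named on the command line, or STDIN)
while (<>) {
  $count++ if /line-matches/;   # replace line-matches with your own pattern
}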

print "count: $count\n";

If you place that in a file count.pl, then you can run it as:

perl count.pl file1 file2 file3 ...

It should also work if you need to use it in a pipeline:

ls *.XML | perl count.pl

Upvotes: 0

stark

Reputation: 13189

# Assumes "my $count = 0;" was declared before the loop that contains this branch.
elsif ($elt =~ /DELETE_.+\.XML/) {
   print " <-- Delete XMLs !!";
   $count++;   # Count the number of times the string is printed
}
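
For context, a self-contained sketch of how that counter might sit in a complete loop. The @names list and the first if branch are assumptions, since the question only shows the elsif branch:

```perl
use strict;
use warnings;

# Hypothetical file names; the real script presumably gets these elsewhere.
my @names = ('DELETE_old.XML', 'report.XML', 'DELETE_tmp.XML', 'notes.txt');

my $count = 0;    # starts at zero, declared before the loop

for my $elt (@names) {
    if ($elt =~ /\.txt$/) {
        # Placeholder branch; the original first condition is not shown.
        print "$elt\n";
    }
    elsif ($elt =~ /DELETE_.+\.XML/) {
        print "$elt  <-- Delete XMLs !!\n";
        $count++;    # one more message printed
    }
    else {
        print "$elt\n";
    }
}

print "Message printed $count times\n";    # prints "Message printed 2 times"
```

The counter is incremented in the same branch that prints the message, so the two can never drift apart.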

Upvotes: 2
