Reputation: 11
I am trying to count the occurrences of a text string.
My Perl code below prints a statement (text string) when it finds certain types of files and I need to count up the times it prints the string.
elsif ($elt =~ /DELETE_.+\.XML/) {
print " <-- Delete XMLs !!";
}
I am just starting to learn Perl and I am not a programmer, so please explain any answers.
I don't want to insert, sort or merge, just count.
Upvotes: 1
Views: 586
Reputation: 57600
If you want to count all files in a directory whose names match /DELETE_.+\.XML/, I would do it like this:
Open the directory.
In Perl, this is done with
opendir my $directory, "path/to/dir" or die "Error while opening: $!";
Then, $directory is a variable that represents a handle to this directory.
Take all files in the directory.
In Perl, we can use the readdir function:
my @files = readdir $directory;
This reads all the contents of that $directory into an array called @files.
Select all files that match the pattern.
In Perl, you can select elements that satisfy a certain condition with grep:
my @interesting_files = grep {/DELETE_.+\.XML/} @files;
#  ^--output                 ^--a condition---^ ^--source
We enclose the condition inside curly braces. It can contain arbitrary code, but we'll just put a regular expression in here. grep is a kind of data filter.
Count all the elements in @interesting_files.
Perl has a concept of context. There is scalar context and list context, and functions and variables behave differently in each. An array used in scalar context evaluates to the number of elements it holds. We can force scalar context with the scalar function:
my $count = scalar @interesting_files;
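To see the two contexts side by side, here is a small sketch. The file names in @names are made up for illustration and are not taken from the question.

```perl
use strict;
use warnings;

# Hypothetical file names, used only for illustration.
my @names = ('DELETE_a.XML', 'keep.txt', 'DELETE_b.XML');

my @copy    = @names;           # list context: copies the elements
my $count   = @names;           # scalar context: number of elements
my $same    = scalar @names;    # scalar context made explicit
my ($first) = @names;           # parentheses give list context: first element

print "count=$count same=$same first=$first\n";
# prints "count=3 same=3 first=DELETE_a.XML"
```

The parentheses around $first are what switch the assignment to list context; without them, $first would receive the element count instead.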
Together, this forms this code:
opendir my $directory, "path/to/dir" or die "Error while opening: $!";
my @files = readdir $directory;
my @interesting_files = grep {/DELETE_.+\.XML/} @files;
my $count = scalar @interesting_files;
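This can be reduced to the following two lines if we omit unnecessary variables and rely on implicit context: assigning to the scalar $count puts grep in scalar context, where it returns the number of matches instead of the matches themselves.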
opendir my $directory, "path/to/dir" or die "Error while opening: $!";
my $count = grep {/DELETE_.+\.XML/} readdir $directory;
However, note that a variable declared with my is only visible until the end of the enclosing block ({...}). If you need $count outside of that block, declare it with my in the outermost scope where it is used, and only assign to it inside the block. Alternatively, you can leave out my altogether, but that has drawbacks: the variable becomes a global, and use strict will reject it.
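A minimal sketch of that scoping rule follows. The @files list is a hypothetical stand-in for the result of readdir, and $inner exists only to show a variable that does not survive the block.

```perl
use strict;
use warnings;

# Hypothetical list standing in for the result of readdir.
my @files = ('DELETE_1.XML', 'notes.txt', 'DELETE_2.XML');

my $count;    # declared in the outer scope so it survives the block below

{
    # Assigning to a scalar puts grep in scalar context, so it returns a count.
    $count = grep { /DELETE_.+\.XML/ } @files;

    my $inner = 'only visible inside this block';
}

# $inner is no longer accessible here, but $count still is.
print "count: $count\n";    # prints "count: 2"
```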
The really elegant solution uses the glob function:
my $count =()= glob "DELETE_*.XML";
This abstracts away the manual directory opening and uses the globbing syntax familiar from the Unix shells. Glob patterns are not regular expressions: here * stands for any run of characters, and a relative pattern like this one is matched against the current working directory. The =()= pseudo-operator can be read as count-of. It imposes list context on the right hand side, but allows the left hand side to have scalar context.
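The glob and the regular expression do not select exactly the same names, which a small experiment can show. This is only a sketch: it creates made-up files in a scratch directory via File::Temp, and it assumes the temporary path contains no whitespace, because glob splits its pattern on whitespace.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Scratch directory with made-up file names; removed when the script exits.
# Assumption: the temporary path contains no whitespace.
my $dir = tempdir(CLEANUP => 1);
for my $name ('DELETE_1.XML', 'DELETE_22.XML', 'keep.XML', 'DELETE_1.XML.bak') {
    open my $fh, '>', "$dir/$name" or die "Cannot create $name: $!";
    close $fh;
}

# glob in list context returns the matching paths; =()= counts them.
my $glob_count =()= glob "$dir/DELETE_*.XML";

# The unanchored regex also matches DELETE_1.XML.bak, which the glob does not.
opendir my $dh, $dir or die "Error while opening: $!";
my $regex_count = grep { /DELETE_.+\.XML/ } readdir $dh;
closedir $dh;

print "glob: $glob_count, regex: $regex_count\n";    # prints "glob: 2, regex: 3"
```

If the regex should behave like the glob, anchoring it as /^DELETE_.+\.XML$/ would exclude the .bak file.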
Upvotes: 4
Reputation: 27528
The following should count the matching lines:
use strict;
use warnings;
my $count = 0;
while (<>) {    # read one line at a time from the named files or from STDIN
$count++ if /line-matches/;
}
print "count: $count\n";
If you place that in a file count.pl, then you can run it as:
perl count.pl file1 file2 file3 ...
It should also work if you need to use it in a pipeline:
ls *.XML | perl count.pl
Upvotes: 0
Reputation: 13189
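elsif ($elt =~ /DELETE_.+\.XML/) {    # escape the dot so it only matches a literal "."
print " <-- Delete XMLs !!";
$count++; # Count the number of times the string is printed; declare "my $count = 0;" before the loop
}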
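Since the fragment above cannot run on its own, here is a hedged sketch of how the counter might fit into a complete loop. The @elements list and the .txt branch are invented for illustration; the original script's loop and earlier conditions are not shown in the question.

```perl
use strict;
use warnings;

# Hypothetical file names standing in for the values that $elt takes.
my @elements = ('report.txt', 'DELETE_old.XML', 'data.XML', 'DELETE_tmp.XML');

my $count = 0;    # declared once, before the loop, so it keeps its value

for my $elt (@elements) {
    print $elt;
    if ($elt =~ /\.txt$/) {              # invented branch, for illustration only
        print " <-- text file";
    }
    elsif ($elt =~ /DELETE_.+\.XML/) {
        print " <-- Delete XMLs !!";
        $count++;                        # one more line carried the message
    }
    print "\n";
}

print "Delete XMLs printed $count times\n";    # prints "Delete XMLs printed 2 times"
```

Declaring $count before the loop matters: a my inside the loop body would create a fresh variable on every iteration, and the total would be lost.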
Upvotes: 2