Reputation: 4108
In my perl script I want to look for some potential regex matches in certain files in many directories.
I have a hash
my %qc = ("QCNM Daily QC" => "GUN",
"Intrinsic Flood QA" => "PUN");
which will grow considerably. In a directory $STUDY_DIR
I want to look at all image header files (image1.hd
, image2.hd
etc) and look for the presence of any hash key in the text. The image header files are just plain text files. For example I want to interrogate image1.hd to see if the text "QCNM Daily QC" or "Intrinsic Flood QA" exists" If QCNM Daily QC exists I want to set a variable $study_type = "GUN", similarly if "Intrinsic Flood QA" is matched I want to set $study_type = "PUN". If no match is found I want to move on to the next image file.
Here is my code so far
#Loop through all images
for ( my $i = 1; $i <= $num_images; $i++ ) {
# Check image is of type described in %qc
# We are only interested in manipulating these files
my $match = 0; #matched qc key to image header
my $study_type; #key value for when hash key is found in image header (eg PUN)
#reset the internal iterator so a prior each() doesn't affect the loop
keys %qc;
while ( my ( $k, $v ) = each %qc ) {
my @match = grep {/$k/} glob("$STUDY_DIR/image${i}.hd");
$match = 1 if match is found then break out of loop;
}
next if $match == 0; #Not a QC image we are interested in skip to next image
}
What I am struggling to do is iterate through each hash key and see if that text exists in the image.hd. If it does exist I want to set $match = 1 and $study_type = %qc{key} and exit the loop. If it doesn't exist I want to move on to the next potential match. The hash keys are mutually exclusive. Only one key can exist pair file although there may be no keys matched in the textfile.
A solution must be in perl as I have some additional perl commands to execute on those files matched.
Upvotes: 2
Views: 865
Reputation: 35208
You're going to need to actually load the file contents to test if they contain certain strings.
I would recommend building a regular expression out of the hash keys to compare with.
The following prints out the first matching value in each file and then moves onto the next file. Note, I use Sort::Key::Natural natsort
to process the files in natural order, but this is just a style preference.
use strict;
use warnings;
use autodie;
use Sort::Key::Natural qw(natsort);
my $STUDY_DIR = '...';
my %qc = (
"QCNM Daily QC" => "GUN",
"Intrinsic Flood QA" => "PUN"
);
my $qc_re = '(?:' . join('|', map quotemeta, sort {length $b <=> length $a} keys %qc) . ')';
FILE:
for my $file ( natsort glob("$STUDY_DIR/image*.hd") ) {
open my $fh, '<', $file;
while (<$fh>) {
if (/($qc_re)/) {
print "$qc{$1} - $file\n";
next FILE;
}
}
}
Upvotes: 3
Reputation: 4108
The following solution though a little cumbersome seems to work. I'm sure a reasonable perl programmer would more than halve the number of lines of code.
#Loop through all images
for ( my $i = 1; $i <= $num_images; $i++ ) {
# Check image is of type described in %qc
# We are only interested in moving these files to QC filestore
my $match = 0; #matched qc key to image header
my $study_type = ""; #key value for when hash key is found in image header (eg PUN)
my $image_header = "$STUDY_DIR/image${i}.hd";
#reset the internal iterator so a prior each() doesn't affect the loop
keys %qc;
while ( my ( $k, $v ) = each %qc ) {
open my $FH, $image_header or die "Could not open $image_header: $!";
my (@lines) = grep /$k/, <$FH>;
#If we get a match update required fields
$match = 1 and $study_type = $qc{$k} if ( $#lines > 0 );
close $FH;
last if $match = 1;
}
print "$match, $study_type\n";
next if $match == 0; #Not a QC image we are interested in skip to next image
}
Upvotes: 1
Reputation: 1376
You can use any
from List::MoreUtils . It exits on the first successful match.
@images = glob("$STUDY_DIR/image${i}.hd)";
if (any { $_ =~ /$k/ } @images){
$match = 1;
last;
}
$study_type = $qc{$key} and last if $match == 1;
Upvotes: 0