Reputation: 138
Below is my attempt at loading all filenames from a text file into an array and comparing that array to the filenames in a separate directory. I would like to identify the filenames that are in the directory but not in the file, so I can then process those files. I am able to load the contents of both the file and the directory successfully, but the compare operation is outputting all the files, not just the difference.
Thank you in advance for the assistance.
use File::Copy;
use Net::SMTP;
use POSIX;
use constant DATETIME => strftime("%Y%m%d", localtime);
use Array::Utils qw(:all);
use strict;
use warnings;
my $currentdate = DATETIME;
my $count;
my $ErrorMsg = "";
my $MailMsg = "";
my $MstrTransferLogFile = ">>//CFVFTP/Users/ssi/Transfer_Logs/Artiva/ARTIVA_Mstr_Transfer_Log.txt";
my $DailyLogFile = ">//CFVFTP/Users/ssi/Transfer_Logs/Artiva/ARTIVA_Daily_Transfer_Log_" . DATETIME . ".txt";
my $InputDir = "//CFVFTP/Users/ssi/Transfer_Logs/folder1/";
my $MoveDir = "//CFVFTP/Users/ssi/Transfer_Logs/folder2/";
my $filetouse;
my @filetouse;
my $diff;
my $file1;
my $file2;
my %diff;
open (MSTRTRANSFERLOGFILE, $MstrTransferLogFile) or $ErrorMsg = $ErrorMsg . "ERROR: Could not open master transfer log file!\n";
open (DAILYLOGFILE, $DailyLogFile) or $ErrorMsg = $ErrorMsg . "ERROR: Could not open daily log file!\n";
#insert all files in master transfer log into array for cross reference
open (FH, "<//CFVFTP/Users/ssi/Transfer_Logs/Artiva/ARTIVA_Mstr_Transfer_Log.txt") or $ErrorMsg = $ErrorMsg . "ERROR: Could not open master log file!\n";
my @master = <FH>;
close FH;
print "filenames in text file:\n";
foreach $file1 (@master) { print "$file1\n"; }
print "\n";
#insert all 835 files in Input directory into array for cross reference
opendir (DIR, $InputDir) or $ErrorMsg = $ErrorMsg . "ERROR: Could not open input directory $InputDir!\n";
my @list = grep { $_ ne '.' && $_ ne '..' && /\.835$/ } readdir DIR;
close(DIR);
print "filenames in folder\n";
foreach $file2 (@list) { print "$file2\n"; }
print "\n";
#get the all files in the Input directory that are NOT in the master transfer log and place into @filetouse array
@diff{ @master }= ();;
@filetouse = grep !exists($diff{$_}), @list;;
print "difference:\n";
foreach my $file3 (@filetouse) { print "$file3\n"; }
print DAILYLOGFILE "$ErrorMsg\n";
print DAILYLOGFILE "$MailMsg\n";
close(MSTRTRANSFERLOGFILE);
close(DAILYLOGFILE);
This is what the output looks like:
filenames in text file:
160411h00448car0007.835
filenames in folder
160411h00448car0007.835
160411h00448car0008.835
160418h00001com0001.835
difference:
160411h00448car0007.835
160411h00448car0008.835
160418h00001com0001.835
Upvotes: 0
Views: 58
Reputation: 126722
This should help you to do what you need. It stores the names of all of the files in INPUT_DIR as keys in hash %files, and then deletes all the names found in LOG_FILE. The remainder are printed.
This program uses autodie so that the success of IO operations needn't be checked explicitly. It was first available in Perl 5 core in v5.10.1.
use strict;
use warnings 'all';
use v5.10.1;
use autodie;
use feature 'say';
use constant LOG_FILE => '//CFVFTP/Users/ssi/Transfer_Logs/Artiva/ARTIVA_Mstr_Transfer_Log.txt';
use constant INPUT_DIR => '//CFVFTP/Users/ssi/Transfer_Logs/folder1/';
chdir INPUT_DIR;
my %files = do {
    opendir my $dh, '.';
    my @files = grep -f, readdir $dh;
    map { $_ => 1 } @files;
};
my @logged_files = do {
    open my $fh, '<', LOG_FILE;
    <$fh>;
};
chomp @logged_files;
delete @files{@logged_files};
say for sort keys %files;
After a lot of attrition I found this underneath your original code
use strict;
use warnings 'all';
use v5.10.1;
use autodie;
use feature 'say';
use Time::Piece 'localtime';
use constant DATETIME => localtime()->ymd('');
use constant XFR_LOG => '//CFVFTP/Users/ssi/Transfer_Logs/Artiva/ARTIVA_Mstr_Transfer_Log.txt';
use constant DAILY_LOG => '//CFVFTP/Users/ssi/Transfer_Logs/Artiva/ARTIVA_Daily_Transfer_Log_' . DATETIME . '.txt';
use constant INPUT_DIR => '//CFVFTP/Users/ssi/Transfer_Logs/folder1/';
use constant MOVE_DIR => '//CFVFTP/Users/ssi/Transfer_Logs/folder2/';
chdir INPUT_DIR;
my @master = do {
    open my $fh, '<', XFR_LOG;
    <$fh>;
};
chomp @master;
my @list = do {
    opendir my $dh, '.';
    grep -f, readdir $dh;
};
my %diff;
@diff{ @master } = ();
my @filetouse = grep { not exists $diff{$_} } @list;
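The one step above that the original lacks is chomp @master. Lines read with <FH> keep their trailing newline, so the hash keys never equal the names returned by readdir. A minimal sketch of that effect, assuming the log holds one filename per line; the in-memory file handle and the sample names from the question stand in for the real log and directory:

```perl
use strict;
use warnings;

# Stand-in for the master log: one filename per line (sample name from the question)
my $log = "160411h00448car0007.835\n";

# Stand-in for the readdir output; these names carry no trailing newline
my @list = qw(
    160411h00448car0007.835
    160411h00448car0008.835
    160418h00001com0001.835
);

# Read the log through an in-memory file handle, as <FH> would read the real file
open my $fh, '<', \$log or die "Cannot open in-memory log: $!";
my @master = <$fh>;
close $fh;

# Without chomp every key ends in "\n", so no directory name matches a key
my %raw;
@raw{@master} = ();
my @without_chomp = grep { not exists $raw{$_} } @list;

# With chomp the keys are bare filenames and the logged file is filtered out
chomp @master;
my %seen;
@seen{@master} = ();
my @with_chomp = grep { not exists $seen{$_} } @list;

print scalar(@without_chomp), " unlogged without chomp\n";
print scalar(@with_chomp), " unlogged with chomp\n";
```

Without chomp all three directory entries are reported, which matches the output shown in the question.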
As you can see, it's very similar to my solution. Here are some notes about your original
Always use lexical file handles. With open FH, ... the file handle is global and will never be closed unless you do it explicitly or until the program terminates. Instead, open my $fh, ... leaves perl to close the file handle at the end of the current block.
Always use the three-parameter form of open, so that the open mode is separate from the file name, and never put an open mode as part of a file name. You opened the same file twice: once as $MstrTransferLogFile, which begins with >>, and once explicitly because you needed read access.
It is very rare for a program to be able to recover from an IO operation error. Unless you are writing fail-safe software, a failure to open or read from a file or directory means the program won't be able to fulfill its purpose. That means there's little reason to accumulate a list of error messages: the code should just die when it can't succeed.
The output from readdir is very messy if you need to process directories because it includes the pseudo-directories . and .. in its list. But if you only want files then a simple grep -f, readdir $dh will throw those out for you, provided the directory being read is the current working directory (hence the chdir above).
The block form of grep is often more readable, and not is much more visible than !. So grep !exists($diff{$_}), @list is clearer as grep { not exists $diff{$_} } @list.
Unless your code is really weird, comments usually just add more noise and confusion and obscure the structure. Make your code look like what it does, so you don't have to explain it.
Oh, and don't throw in all the things you might need at the start "just in case". Write your code as if it was all there and the compiler will tell you what's missing.
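The first few notes can be sketched together as follows. This is only an illustration under stated assumptions: the scratch directory from File::Temp and the file name transfer.log are hypothetical stand-ins for the real folder and log, and the explicit or die checks show what autodie would otherwise do for you.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical scratch directory standing in for the real input folder
my $dir      = tempdir(CLEANUP => 1);
my $log_path = "$dir/transfer.log";    # hypothetical log name

# Three-argument open: the mode ('>>') is kept separate from the file name
{
    open my $out, '>>', $log_path or die "Cannot append to $log_path: $!";
    print {$out} "example.835\n";
    close $out or die "Cannot close $log_path: $!";
}

# Same path, different mode; no second variable with an embedded '<' is needed
my @logged = do {
    open my $in, '<', $log_path or die "Cannot read $log_path: $!";
    <$in>;
};    # $in goes out of scope here, so perl closes the handle
chomp @logged;

# -f needs a path it can resolve, so prefix the directory when the
# program has not chdir'ed into it
opendir my $dh, $dir or die "Cannot open directory $dir: $!";
my @files = grep { -f "$dir/$_" } readdir $dh;
closedir $dh;

print "logged: @logged\n";
print "files:  @files\n";
```

Because each handle is lexical, nothing outside its block can accidentally read from or write to it.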
I hope that helps
Upvotes: 1
Reputation: 3549
First, use a hash to store your already-processed files. Then it's just a matter of checking if a file exists in the hash.
(I've changed some variable names to make the answer a bit clearer.)
foreach my $file (@dir_list) {
    push @to_process, $file unless ($already_processed{$file});
}
(Which could be a one-liner, but get it working in its most expanded form first.)
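A minimal sketch of how the hash might be built and used, assuming @already_processed holds the chomped lines of the log and @dir_list holds the names from the directory (the sample names are taken from the question):

```perl
use strict;
use warnings;

# Assumed inputs: chomped log lines and the directory listing
my @already_processed = ('160411h00448car0007.835');
my @dir_list = qw(
    160411h00448car0007.835
    160411h00448car0008.835
    160418h00001com0001.835
);

# Every logged name becomes a hash key with a true value
my %already_processed = map { $_ => 1 } @already_processed;

# Expanded form: keep each file that is not a key in the hash
my @to_process;
foreach my $file (@dir_list) {
    push @to_process, $file unless $already_processed{$file};
}

# The same filter written as a single grep
my @to_process_grep = grep { !$already_processed{$_} } @dir_list;

print "$_\n" for @to_process;
```

Each hash lookup takes roughly constant time, so the whole filter is a single pass over @dir_list.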
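If you insist on your array, this works but is much less efficient, because grep scans the whole array for every file
foreach my $file (@dir_list) {
    # compare with eq: in a regex the dots in a filename would match any character
    push @to_process, $file unless (grep { $_ eq $file } @already_processed);
}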
(Again could be a one-liner, but...)
Upvotes: 0