Sproggit

Reputation: 71

Getting the names and sizes of all the files in sub-directories using Perl

I have written a routine to read the names and sizes of all the files in two folders and all their sub-folders. The root folder names are supplied as command-line arguments, and each folder is processed in a for loop, with the details written to a separate output file for each of the two folders. However, only the names and sizes of the files directly inside the two root folders are output. The script never changes into the sub-folders to repeat the process, so the files in the sub-directories are ignored. Debug traces show that the chdir command is never executed, so there is something wrong with my condition, but I can't see what it is. The code looks like this:

#!/usr/bin/perl
#!/usr/local/bin/perl

use strict;
use warnings;

my $dir = $ARGV[0];
opendir(DIR, $dir) or die "Could not open directory '$dir' $!";
my @subdirs = readdir(DIR) or die "Unable to read directory '$dir': $!";

for (my $loopcount = 1; $loopcount < 3; $loopcount = $loopcount + 1) {
    my $filename = 'FileSize_'.$dir.'.txt';
    open (my $fh, '>', $filename) or die "Could not open file '$filename' $!";      
    for my $subdir (sort @subdirs) {
        unless (-d $subdir) {
            # Ignore Sub-Directories in this inner loop
            # only process files
            # print the file name and file size to the output file
            print "Processing files\n";
            my $size = -s "$dir/$subdir";
            print $fh "$subdir"," ","$size\n";
        }
        elsif (-s "$dir/$subdir") {
        # We are here because the entry is a sub-folder and not a file
        # if this sub-folder is non-zero size, i.e has files then
        # change to this directory and repeat the outer for loop
            chdir $subdir;
            print "Changing to directory $subdir\n";
            print "Processing Files in $subdir\n";
        };
    }
    # We have now processed all the files in the first folder and all its subdirectories
    # Now assign the second root directory to the $dir variable and repeat the loop
    print "Start For Next Directory\n";
    $dir = $ARGV[1];
    opendir(DIR, $dir) or die "Could not open directory '$dir' $!";
    @subdirs = readdir(DIR) or die "Unable to read directory '$dir': $!";;
}
exit 0;

The command-line invocation is "perl FileComp.pl DiskImage DiskImage1". Only the names and sizes of the files directly in the DiskImage and DiskImage1 root folders are output; all the files in the sub-folders are ignored. The "elsif" condition is never met, so the code in that branch is never executed, which means the error must be there. Thanks in advance for any advice.

Upvotes: 1

Views: 1088

Answers (2)

Grinnz

Reputation: 9231

It's much easier to do logic like this without changing directories, but if you do change directories, use File::chdir or File::pushd so that you return to the previous directory when exiting that scope. However, this problem is much easier to solve with a recursive iterator like Path::Iterator::Rule, which handles the subdirectory logic:

use strict;
use warnings;
use Path::Iterator::Rule;
use Path::Tiny;

my $rule = Path::Iterator::Rule->new->not_directory;
foreach my $dir (@ARGV) {
    my $fh = path("FileSize_$dir.txt")->openw;
    my $next = $rule->iter($dir);
    while (defined(my $item = $next->())) {
        my $size = -s $item;
        print $fh "$item $size\n";
    }
}

Alternatively you can use the visitor callback which gets passed both the full path (for file operations) and basename of each item:

my $rule = Path::Iterator::Rule->new->not_directory;
foreach my $dir (@ARGV) {
    my $fh = path("FileSize_$dir.txt")->openw;
    $rule->all($dir, {visitor => sub {
        my ($path, $basename) = @_;
        my $size = -s $path;
        print $fh "$basename $size\n";
    }});
}
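
If installing CPAN modules is not an option, the core File::Find module can do a similar traversal. The following is only a minimal sketch of that swapped-in technique, not the Path::Iterator::Rule approach shown above; the helper name file_sizes is hypothetical, and it assumes the same "FileSize_$dir.txt" output naming as the question.

```perl
use strict;
use warnings;
use File::Find;

# Hypothetical helper: collects [path, size] for each plain file under $dir.
sub file_sizes {
    my ($dir) = @_;
    my @found;
    find({
        no_chdir => 1,    # keep the working directory fixed; $_ is then the full path
        wanted   => sub {
            return unless -f $_;                 # skip directories and other non-files
            push @found, [$_, (-s $_) || 0];     # report empty files as size 0
        },
    }, $dir);
    return @found;
}

foreach my $dir (@ARGV) {
    my $filename = "FileSize_$dir.txt";
    open(my $fh, '>', $filename) or die "Could not open file '$filename': $!";
    # Sort by path so the output order is stable between runs
    print $fh "$_->[0] $_->[1]\n" for sort { $a->[0] cmp $b->[0] } file_sizes($dir);
    close($fh);
}
```

With no_chdir set, the current directory never changes, so the output file is always created relative to where the script was started.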

Upvotes: 2

Chris Turner

Reputation: 8142

This check will almost always give the wrong result because it tests $subdir relative to the current working directory rather than inside $dir:

   unless (-d $subdir) {

$subdir is the name of a file or directory inside $dir, so to access it you need to use the relative path $dir/$subdir, just like you're doing here:

        my $size = -s "$dir/$subdir";

Even once you fix that unless check, the chdir will cause further problems: you're calling it partway through reading $dir's contents, so the relative path $dir/$subdir will no longer resolve for the later entries.
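
Both points can be addressed by always testing the full path and recursing instead of calling chdir. The following is a rough sketch using only core Perl; the helper name list_files is hypothetical, and it assumes the same command-line arguments and "FileSize_$dir.txt" output naming as the question.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: returns a list of [path, size] pairs for every
# plain file under $dir, descending into subdirectories without chdir.
sub list_files {
    my ($dir) = @_;
    opendir(my $dh, $dir) or die "Could not open directory '$dir': $!";
    # Skip the '.' and '..' entries so the recursion terminates
    my @entries = sort grep { $_ ne '.' && $_ ne '..' } readdir($dh);
    closedir($dh);

    my @found;
    for my $entry (@entries) {
        my $path = "$dir/$entry";    # always test the path relative to where we started
        if (-d $path) {
            push @found, list_files($path);       # recurse instead of chdir
        }
        elsif (-f $path) {
            push @found, [$path, (-s $path) || 0];    # report empty files as size 0
        }
    }
    return @found;
}

for my $root (@ARGV) {
    my $filename = "FileSize_$root.txt";
    open(my $fh, '>', $filename) or die "Could not open file '$filename': $!";
    print $fh "$_->[0] $_->[1]\n" for list_files($root);
    close($fh);
}
```

Looping over @ARGV also removes the need for the hard-coded loop counter. Note that this sketch does not guard against symbolic links that point back up the tree.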

Upvotes: 3
