mariodrumblue

Reputation: 171

Use Perl to delete files with a given extension in directory and subdirectories

I am a Perl newbie. I am trying to delete all files with a certain extension in a directory (A) and all its subdirectories (B, C). I have learnt how to do so for a given directory, but not recursively. That is, the following does the job in the A directory but not in the B and C subdirectories.

use strict;    
use warnings;    
my $dir = "~/A/";    
unlink glob "$dir/*.log";

I have tried with

use strict;
use warnings;
use File::Find;
my $dir = "~/A";
find(\&wanted, $dir);
sub wanted { 
unlink glob "*.log";
}

but then I get the message "Can't stat ~/A: No such file or directory", even though the directory is there. Any hint? Mario

Upvotes: 3

Views: 15920

Answers (7)

Mike

Reputation: 59

You can use opendir / readdir. Here is my solution for managing several directories with varying retention periods, optionally restricting deletion to file names that match a regex. Note that it does not descend into subdirectories.

# Directories to maintain, "|"-delimited: path|days to keep|optional file name regex.
my @directories_and_retention = (
qq!$ENV{ARCDIR}|3|\\.lok\$!, # be careful: only names matching \.lok$ are considered
qq!$ENV{APPPATH}/ldap/logs|5!,
qq!$ENV{LOGDIR}/canary|2!,
qq!$ENV{LOGDIR}/metadata|30!,
qq!$ENV{LOGDIR}/archive|45!
);

foreach my $directory (@directories_and_retention) {
        my ($path,$retention_days,$file) = split(/\|/,$directory);

        # Skip directories that cannot be opened instead of failing silently.
        opendir (my $dh, $path) or do { warn "Cannot open $path: $!\n"; next };
        my @logfiles = readdir($dh);
        closedir ($dh);

        foreach my $logfile (@logfiles) {
                next if ($logfile =~ /^\./);    # skips ".", ".." and hidden files
                next if (-d "$path/$logfile");  # subdirectories are left alone

                if ($file) {
                        next unless ($logfile =~ /$file/);
                }

                # -M gives the file's age in days, measured from script start.
                if (-M "$path/$logfile" > $retention_days) {
                        print "$path/$logfile > $retention_days\n";
                        unlink("$path/$logfile") or warn "Cannot delete $path/$logfile: $!\n";
                }
        }
}

Upvotes: 0

mirkobrankovic

Reputation: 2347

It seems that File::Find has a problem with the "~" character: it does not expand it to your home directory. When I replace it with, for example, /root/, it works fine. So, as @mpapec suggested, change it to $ENV{HOME}:

use strict;
use warnings;
use File::Find;
my $dir = "$ENV{HOME}/A";
find(\&wanted, $dir);
sub wanted {
unlink glob "*.log";
}
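
A small sketch of the difference, assuming a Unix-like system: Perl's built-in glob performs tilde expansion itself (which is why the first script in the question worked), while an ordinary string such as "~/A" is passed to find and to file tests unchanged. The variable names are only illustrative.

```perl
use strict;
use warnings;

# Assumption: a Unix-like system where glob expands a leading "~".
my ($expanded) = glob '~';   # glob does its own tilde expansion
my $literal    = '~';        # a plain string is never expanded

print "glob('~') gives: $expanded\n";
print "literal '~' is a directory: ",   ( -d $literal  ? "yes" : "no" ), "\n";
print "expanded path is a directory: ", ( -d $expanded ? "yes" : "no" ), "\n";
```

The literal "~" is only a directory if the current working directory happens to contain an entry with that name.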

Upvotes: 1

David W.

Reputation: 107040

I wouldn't bother with glob if you're already using find. Might as well simply find the files you want and delete them:

use strict;
use warnings;
use File::Find;
use Env qw(HOME);

use constant {
    SUFFIX_LIST => qr/\.(log|foo|bar)$/,
    DIR_TO_CHECK => $HOME,
};

my @file_list;

find ( sub {
    return unless -f;
    return unless $_ =~ SUFFIX_LIST;
    push @file_list, $File::Find::name;
}, DIR_TO_CHECK );

unlink @file_list;

I've defined a regular expression (That's the qr/.../) that defines the list of suffixes I'm interested in. I set my constant SUFFIX_LIST to this regular expression. If my file's name matches my regular expression, it's a file I want to delete.

I define a @file_list which I do mainly out of habit and because of the way find works. I am not a big find fan, but that's what we have. The problem is that find wants all of your code inside the find subroutine and this is a bad practice. To get around this, I have my find subroutine push files I want into an array, then operate on that array.

In this particular program, I could have done my unlink right in the find since it is so short. However, most of the time, you're better off using this technique.

The find function uses two special package variables, $File::Find::name and $File::Find::dir. The first is the name of the file with its path, starting with the name of the directory given to the find function. The second is the path of the directory that contains the file. The find function also sets $_ to the current file name. Since find is actually in the directory with the file, $_ has no directory name on it and can be used to test the file.
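
To see the three values side by side, here is a minimal sketch that builds a throwaway tree with File::Temp and records what find reports for one file. The tree layout and the file name example.log are made up for illustration.

```perl
use strict;
use warnings;
use File::Find;
use File::Temp qw(tempdir);

# Hypothetical scratch tree: <tmp>/sub/example.log (removed at exit).
my $root = tempdir( CLEANUP => 1 );
mkdir "$root/sub" or die "mkdir failed: $!";
open my $fh, '>', "$root/sub/example.log" or die "open failed: $!";
close $fh;

my @seen;
find( sub {
    return unless -f $_;    # $_ is relative to the directory find is in
    push @seen, {
        name => $File::Find::name,    # start directory plus path to the file
        dir  => $File::Find::dir,     # directory currently being visited
        base => $_,                   # bare file name
    };
}, $root );

for my $entry (@seen) {
    print "name: $entry->{name}\n";
    print "dir:  $entry->{dir}\n";
    print "\$_:   $entry->{base}\n";
}
```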

I do two tests: 1) Is this a file? and 2) Does this file's name end with one of the suffixes I'm interested in? (Note that for the first test I can simply write unless -f, because -f defaults to $_, while for the second I must name the $_ variable explicitly.)

If the file is a file and has the right suffix, I push it into my @file_list array.

I prefer to embed my wanted subroutine into my find command. It keeps the function together with the code that affects it. The following two are equivalent:

find ( sub {
    return unless -f;
    return unless $_ =~ SUFFIX_LIST;
    push @file_list, $File::Find::name;
}, DIR_TO_CHECK );

and

find (\&wanted, DIR_TO_CHECK );

sub wanted {
    return unless -f;
    return unless $_ =~ SUFFIX_LIST;
    push @file_list, $File::Find::name;
};

I use constants for things that really are constants. It's a good programming habit. Perl constants are a bit funky in that they have no sigil on them. Thus, you have to be careful whenever you use them where they could be confused with a string.
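
A short sketch of where a sigil-less constant can be mistaken for a string. The constant SUFFIX and the variable names are hypothetical and not part of the program above.

```perl
use strict;
use warnings;

use constant SUFFIX => '.log';    # hypothetical constant for illustration

my $concatenated = "file" . SUFFIX;        # the constant is called: "file.log"
my $in_quotes    = "fileSUFFIX";           # inside quotes it is plain text
my $interpolated = "file@{[ SUFFIX ]}";    # one way to interpolate: "file.log"

my %by_name  = ( SUFFIX   => 1 );   # => quotes the bareword: key is "SUFFIX"
my %by_value = ( SUFFIX() => 1 );   # parentheses force the call: key is ".log"

print "$concatenated $in_quotes $interpolated\n";
print join( ",", keys %by_name ), " ", join( ",", keys %by_value ), "\n";
```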

I also use use Env to pull in environment variables I want to define, and only those. I could pull them in via the $ENV{HOME} construct. It depends upon your preferences. The $ENV{..} construct makes it clear you're pulling in an environment variable. The use Env is cleaner looking.
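
A minimal sketch of how the two forms relate: a variable imported with use Env is tied to the matching %ENV entry, so reads and writes go through to the environment. DEMO_DIR is a hypothetical variable name chosen so the example does not touch HOME.

```perl
use strict;
use warnings;
use Env qw(DEMO_DIR);    # hypothetical environment variable, for illustration

$ENV{DEMO_DIR} = '/tmp/demo';
print "$DEMO_DIR\n";         # reads through to $ENV{DEMO_DIR}

$DEMO_DIR = '/tmp/other';    # writes go through to %ENV as well
print "$ENV{DEMO_DIR}\n";
```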

Upvotes: 1

chrsblck

Reputation: 4088

You are right that glob will not recurse into child directories.

I would run the following code as-is first, so you can see what it would do: with $DEBUG set, it only prints what it finds and deletes nothing. Once you understand it, you can either turn $DEBUG off or remove it from the code.

#!/usr/bin/perl

use warnings;
use strict;
use File::Find;

my $path = "$ENV{HOME}/A";
my $DEBUG = 1;

find(\&wanted, $path);

sub wanted {
    return if ! -e; 

    my $file = $File::Find::name;

    if ($DEBUG) {
        if( $file =~ /\.log$/ ) { 
            print "Log file found: $file\n"
        } else {
            print "Non-log file found: $file\n";
        }   
    } else {
        # anything that ends with '.log'
        unlink $file if $file =~ /\.log$/;
    }   
}

Upvotes: 1

chooban

Reputation: 9256

Are you running on Linux? If so, I have an alternative solution which might help. I'm going on the basis that, setting aside the language required, the problem is "I need to delete all files with a certain extension, and do it recursively". If this is part of a larger piece of work, ignore my answer; if you're just doing some admin, it might work:

find . -type f -name "*.ext" -exec rm {} \;

This will find all regular files whose names match *.ext in the current directory and below, then pass each path to the rm command.

Upvotes: 0

Birei

Reputation: 36262

In your second script, don't do another search inside the function passed to find, because find already traverses the tree recursively. Simply check whether the file is a log and delete it. A one-liner:

perl -MFile::Find -e '
    find( 
        sub { m/\.log$/ and do { unlink $_ or warn qq|Could not unlink file $_: $!\n| } 
        }, 
        shift 
    )
' .

It accepts one argument, "." in my case, to begin the search at the current directory.
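
The same idea can be sketched as a reusable subroutine. The name delete_by_suffix and its return value (the number of files removed) are assumptions made for illustration, not something from the one-liner above.

```perl
use strict;
use warnings;
use File::Find;

# Hypothetical helper: delete every regular file under $dir whose name
# ends in $suffix, and return the number of files removed.
sub delete_by_suffix {
    my ( $dir, $suffix ) = @_;
    my $deleted = 0;
    find( sub {
        return unless -f $_;                 # regular files only
        return unless /\Q$suffix\E\z/;       # literal suffix at end of name
        if ( unlink $_ ) {
            $deleted++;
        }
        else {
            warn "Could not unlink $File::Find::name: $!\n";
        }
    }, $dir );
    return $deleted;
}
```

Called as delete_by_suffix("$ENV{HOME}/A", ".log"), it would remove the .log files in A and in every directory below it. The \Q...\E quoting keeps the dot in the suffix from acting as a regex wildcard.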

Upvotes: 4

mpapec

Reputation: 50647

Try $ENV{"HOME"} instead of ~, which is shell-specific:

use strict;    
use warnings;    
my $dir = "$ENV{HOME}/A";
unlink glob "$dir/*.log";

Upvotes: 5
