dhillon

Reputation: 3

Perl script to copy a directory structure from the source to the destination

#!/usr/bin/perl -w
use    File::Copy;
use    strict;

my   $i=  "0";
my   $j=  "1";
my   $source_directory = $ARGV[$i]; 
my    $target_directory = $ARGV[$j];
#print $source_directory,"\n";
#print $target_directory,"\n";

my@list=process_files ($source_directory);
print "remaninign files\n";
print @list;
# Accepts one argument: the full path to a directory.
# Returns: A list of files that reside in that path.
sub process_files {
    my $path = shift;

    opendir (DIR, $path)
     or die "Unable to open $path: $!";

    # We are just chaining the grep and map from
    # the previous example.
    # You'll see this often, so pay attention ;)
    # This is the same as:
    # LIST = map(EXP, grep(EXP, readdir()))
    my @files = 
        # Third: Prepend the full path
        map { $path . '/' . $_}
        # Second: take out '.' and '..'
        grep { !/^\.{1,2}$/ }
        # First: get all files
        readdir (DIR);

    closedir (DIR);

    for (@files) {
        if (-d $_) {
            # Add all of the new files from this directory
            # (and its subdirectories, and so on... if any)
            push @files, process_files ($_);

        } else { #print @files,"\n";
          # for(@files)
        while(@files) 
        {  
        my $input= pop @files;
        print $input,"\n";
        copy($input,$target_directory);

             }
  }
      # NOTE: we're returning the list of files
   return @files;
  }
}

This basically copies files from the source to the destination, but I need some guidance on how to copy the directory structure as well. The main thing to note here is that no CPAN modules are allowed except copy, move, and path.

Upvotes: 0

Views: 4622

Answers (1)

David W.

Reputation: 107090

Instead of rolling your own directory-processing adventure, why not simply use File::Find to go through the directory structure for you?

#! /usr/bin/env perl

use 5.010;
use strict;
use warnings;
use File::Find;

use File::Path qw(make_path);
use File::Copy;
use Cwd;

# The first two arguments are source and dest
# 'shift' pops those arguments off the front of
# the @ARGV list, and returns what was removed

# I use "cwd" to get the current working directory
# and prepend that to the destination argument. That way,
# $dest_dir still points at the right place after the chdir
# below. (This assumes the destination is a relative path.)

my $source_dir = shift;
my $dest_dir   = cwd() . "/" . shift;

# I change into my $source_dir, so the $source_dir
# directory isn't in the file name when I find them.

chdir $source_dir
    or die qq(Cannot change into "$source_dir": $!);

find ( sub {
   return unless -f;   #We want files only
   make_path "$dest_dir/$File::Find::dir" 
       unless -d "$dest_dir/$File::Find::dir";
   copy "$_", "$dest_dir/$File::Find::dir"
       or die qq(Can't copy "$File::Find::name" to "$dest_dir/$File::Find::dir");
}, ".");

Now, you don't need a process_files subroutine. You let File::Find::find handle recursing the directory for you.

By the way, you could rewrite the find like this, which is how you usually see it in the documentation:

find ( \&wanted, ".");

sub wanted {
   return unless -f;   #We want files only
   make_path "$dest_dir/$File::Find::dir" 
       unless -d "$dest_dir/$File::Find::dir";
   copy "$_", "$dest_dir/$File::Find::dir"
       or die qq(Can't copy "$File::Find::name" to "$dest_dir/$File::Find::dir");
}

I prefer to embed my wanted subroutine in the find command itself because I think it reads better. First of all, it guarantees that the wanted subroutine is kept with the find command, so you don't have to look in two different places to see what's going on.

Also, the find command has a tendency to swallow up your entire program. Imagine a case where I get a list of files and do some complex processing on them. The entire program can end up in the wanted subroutine. To avoid this, you simply create an array of the files you want to operate on, and then operate on them in the main body of your program:

...
my @file_list;

find ( \&wanted, "$source_dir" );

for my $file ( @file_list ) {
    ...
}

sub wanted {
    return unless -f;
    push @file_list, $File::Find::name;
}

I find this a programming abomination. First of all, what is going on with find? It's modifying my @file_list, but how? Nowhere in the find command is @file_list mentioned. What is it doing?

Then at the end of my program is this sub wanted function that uses a variable, @file_list, in a global manner. That's bad programming practice.

Embedding my subroutine directly into my find command solves many of these issues:

my @file_list;

find ( sub {
    return unless -f;
    push @file_list, $File::Find::name;
}, $source_dir );

for my $file ( @file_list ) {
    ...
}

This just looks better. I can see that @file_list is being manipulated directly by my find command. Plus, that pesky wanted subroutine has disappeared from the end of my program. It's the exact same code. It just looks better.


Let's get to what that find command is doing and how it works with the wanted subroutine:

The find command finds each and every file, directory, link, or whatnot located in the directory list you pass to it. With each item it finds in that directory, it passes it to your wanted subroutine for processing. A return leaves the wanted subroutine and allows find to fetch the next item.

Each time the wanted subroutine is called, find sets three variables:

  • $File::Find::name: The name of the item found with the full path attached to it.
  • $File::Find::dir: The name of the directory where the item was found.
  • $_: The name of the item without the directory name.
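
To see those three variables side by side, here is a minimal sketch. It builds a small throwaway tree with File::Temp (a core module, used here only so the example is self-contained; the file names are made up) and records what find hands to the callback for each item:

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use File::Find;
use File::Path qw(make_path);
use File::Temp qw(tempdir);

# Hypothetical scratch tree, removed automatically when the script exits.
my $root = tempdir( CLEANUP => 1 );
make_path("$root/docs");
for my $path ( "$root/top.txt", "$root/docs/readme.txt" ) {
    open my $fh, '>', $path or die qq(Cannot create "$path": $!);
    close $fh;
}

my @seen;    # one record per item that find passes to the callback

find( sub {
    push @seen, {
        name => $File::Find::name,    # full path of the item
        dir  => $File::Find::dir,     # directory the item was found in
        item => $_,                   # bare name, without the directory
    };
}, $root );

printf "%-12s dir=%s name=%s\n", $_->{item}, $_->{dir}, $_->{name}
    for @seen;
```

Note that the starting directory itself is also passed to the callback, with $_ set to ".".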

In Perl, that $_ variable is very special. It's sort of a default variable for many commands. That is, if you execute a command and don't give it a variable to use, that command will use $_. For example:

print

prints out $_

return if -f;

Is the same as saying this:

if ( -f $_ ) {
    return;
}

This for loop:

for ( @file_list ) {
   ...
}

Is the same as this:

for $_ ( @file_list ) {
    ...
}

Normally, I avoid the default variable. It's global in scope and it's not always obvious what is being acted upon. However, there are a few circumstances where I'll use it because it really clarifies the program's meaning:

return unless -f;

in my wanted function is very obvious. I exit the wanted subroutine unless I was handed a file. Here's another:

return unless /\.txt$/;

This will exit my wanted function unless the item ends with '.txt'.
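
Putting both of those guards together, a minimal sketch might look like the following. The scratch directory and the file names are invented for the example, and File::Temp is only there to make it self-contained:

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use File::Find;
use File::Temp qw(tempdir);

# Hypothetical scratch directory holding a mix of file types.
my $root = tempdir( CLEANUP => 1 );
for my $name (qw(notes.txt image.png todo.txt)) {
    open my $fh, '>', "$root/$name" or die qq(Cannot create "$root/$name": $!);
    close $fh;
}

my @text_files;

find( sub {
    return unless -f;          # skip directories and other non-files
    return unless /\.txt$/;    # skip names that don't end in ".txt"
    push @text_files, $_;      # $_ is the bare file name
}, $root );

print "$_\n" for sort @text_files;
```

Both tests act on $_ without naming it, which is what keeps the callback this short.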

I hope this clarifies what my program is doing. Plus, I eliminated a few bugs while I was at it. I had miscopied $File::Find::dir as $File::Find::name, which is why you got the error.
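
One gap worth noting: because the callback returns unless -f, a directory that contains no files never reaches make_path, so it isn't recreated at the destination. If you need empty directories too, a variation along these lines might work. This is a sketch under assumptions, not the program above: it skips the chdir and instead uses File::Spec->abs2rel (a core module) to work out each item's path relative to the source, and the temporary source and destination directories exist only to make the example runnable.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use File::Find;
use File::Path qw(make_path);
use File::Copy;
use File::Spec;
use File::Temp qw(tempdir);

# Hypothetical source tree: one directory with a file, one empty directory.
my $source_dir = tempdir( CLEANUP => 1 );
my $dest_dir   = tempdir( CLEANUP => 1 );
make_path( "$source_dir/full", "$source_dir/empty" );
open my $fh, '>', "$source_dir/full/data.txt"
    or die qq(Cannot create test file: $!);
print {$fh} "hello\n";
close $fh;

find( sub {
    # Path of this item relative to the source, e.g. "full/data.txt"
    my $relative = File::Spec->abs2rel( $File::Find::name, $source_dir );
    my $target   = "$dest_dir/$relative";

    if ( -d ) {
        # Recreate every directory, whether or not it has files in it
        make_path($target) unless -d $target;
    }
    elsif ( -f ) {
        # find visits a directory before its contents, so the
        # target directory already exists by the time we get here
        copy( $_, $target )
            or die qq(Can't copy "$File::Find::name" to "$target": $!);
    }
}, $source_dir );
```

In a real script, $source_dir and $dest_dir would come from @ARGV as before, and abs2rel is simplest to reason about when both are absolute paths.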

Upvotes: 2
