Lisa
Lisa

Reputation: 121

Perl - Uncompressing zip files on windows is too slow

I've created a uncompress function, put together from a few code snippets and a few alterations from my side, automatically handling the file type.

My current usecase is to extract a ~550mb zip file from a SMB share on windows with a lot of files in it (qt 5.5 source code)

On Linux, this is a tgz file on a nfs share and it takes 67 seconds for the function to extract it. (other uncompression method than for zip files)

On Windows it takes >15minutes.

I'm thinking about using a system(7z $source) call as alternative.

Do you have any suggestions what's the fastest method to extract a zip file on windows?

Plz be honest, if my uncompress function is crap, i'm no perl expert... :)

Here's my code:

#uncompress full archive file $archFile to $destPath
sub uncompress
{
  my $fileToExtract = shift;
  my $targetPath = shift;
  my $silent = shift;
  my $status;
  my $buff;

  unless (-f $fileToExtract)
  {
    &error ("$fileToExtract is not a file!"); 
  }

  unless (-d $targetPath)
  {
    &makeDir($targetPath, 1);
  }

  # just look for .tar since all .tar archives with all compressions can be extracted.
  if ($fileToExtract =~ m/.tar/)
  {
    my $pwd = getcwd();
    changeDirectory($targetPath, 1);
    my $tar = Archive::Tar->new();

    $tar->read($fileToExtract);
    $tar->extract();
    changeDirectory($pwd, 1);


    return; 
  }

  elsif ($fileToExtract =~ m/.zip$/)
  {
    my $u = new IO::Uncompress::Unzip $fileToExtract or die "Cannot open $fileToExtract: $UnzipError";

    for ($status = 1; $status > 0; $status = $u->nextStream()) 
    {
      my $header = $u->getHeaderInfo();
      my (undef, $path, $name) = splitpath($header->{Name});
      my (undef, $path, $name) = splitpath($header->{Name});
      my $destdir = "$targetPath$path";

      unless (-d $destdir)
      {
        &makeDir( $destdir, 1);
      }

      if ($name =~ m!/$!) {
        last if $status < 0;
        next;
      }


      my $destfile = "$destdir/$name";

      if ($destfile =~ m/\/\/$/) # skip if no filename is given
      {
        next;
      }

      $destfile =~ s|\/\/|\/|g; # remove unnecessary doubleslashes

      my $fh = openFileHandle ( $destfile , '>', 1 );

      binmode($fh);
      while (($status = $u->read($buff)) > 0) {
        $fh->write($buff);
      }
      $fh->close();

      unless (defined $silent)
      {
        &syslog ("Uncompress $destfile -> $targetPath");
      }

      #set timestamps of file to the ones in the zip
      my $stored_time = $header->{'Time'};
      utime ($stored_time, $stored_time, $destfile);
    }

    if ($status < 0)
    {
      die "Error processing $fileToExtract: $!\n"
    } 
  }
  else
  {
    my $ae = Archive::Extract->new( archive => $fileToExtract );
    $ae->extract( to => $targetPath ) or &error("Failed to extract $fileToExtract with error $ae->error");

    unless (defined $silent)
    {
      foreach my $file (@{$ae->files})
      {
        #only print if not a directory
        if( $file!~m|/$| )
        {
          &syslog("Uncompress $fileToExtract -> $targetPath");
        }
      }
    }
  }
  return;
}

Upvotes: 1

Views: 476

Answers (1)

Chankey Pathak
Chankey Pathak

Reputation: 21666

You could simply do it in below manner using Archive::Extract, it provides generic archive extracting mechanism, therefore you don't have to install separate modules for tar and zip.

use Archive::Extract;
my $ae = Archive::Extract->new( archive => $fileToExtract );
my $ok = $ae->extract( to => $targetPath ); 

If you specifically want to check whether a file is tar or zip then you can use below:

$ae->is_tar
$ae->is_zip

Note that Archive::Extract is a core module therefore you'll not have to install it separetely.

Upvotes: 1

Related Questions