David W.

Reputation: 107040

Perl reading zip files with IO::Uncompress::AnyUncompress

We are moving from our current build system (which is a mess) to one that uses Ant with Ivy. I'm cleaning up all the build files, and finding the jar dependencies. I thought it might be easier if I could automate it a bit, by going through the jars that are checked into the project, finding what classes they contain, then matching those classes with the various import statements in the Java code.

I have used Archive::Tar before, but Archive::Zip isn't a standard Perl module. (My concern is that someone is going to try my script, call me in the middle of the night and tell me it isn't working.)

I noticed that IO::Uncompress::AnyUncompress is a standard module, so I thought I could try IO::Uncompress::AnyUncompress, or at least IO::Uncompress::Unzip, which is also a standard module.

Unfortunately, the documentation for these modules gives no examples (according to the documentation, examples are a todo).

I'm able to successfully open my jar and create an object:

 my $zip_obj = IO::Uncompress::AnyUncompress->new ( $zip_file );

Now, I want to see the contents. According to the documentation:

getHeaderInfo

Usage is

$hdr  = $z->getHeaderInfo();
@hdrs = $z->getHeaderInfo();

This method returns either a hash reference (in scalar context) or a list of hash references (in array context) that contains information about each of the header fields in the compressed data stream(s).

Okay, this isn't an object like Archive::Tar or Archive::Zip returns, and there are no methods or subroutines mentioned to parse the data. I'll use Data::Dumper and see what hash keys are contained in the reference.

Here's a simple test program:

#! /usr/bin/env perl
use 5.12.0;
use warnings;

use IO::Uncompress::AnyUncompress;
use Data::Dumper;

my $obj = IO::Uncompress::AnyUncompress->new("testng.jar")
    or die qq(You're an utter failure);

say qq(Dump of \$obj = ) . Dumper $obj;

# List context: getHeaderInfo returns a list of hash references
my @headers = $obj->getHeaderInfo;
say qq(Dump of \$header = ) . Dumper $headers[0];

And here's my results:

Dump of $obj = $VAR1 = bless( \*Symbol::GEN0, 'IO::Uncompress::Unzip' );

Dump of $header = $VAR1 = {
          'UncompressedLength' => 0,
          'Zip64' => 0,
          'MethodName' => 'Stored',
          'Stream' => 0,
          'Time' => 1181224440,
          'MethodID' => 0,
          'CRC32' => 0,
          'HeaderLength' => 43,
          'ExtraFieldRaw' => '¦-  ',
          'ExtraField' => [
                            [
                              '¦-',
                              ''
                            ]
                          ],
          'FingerprintLength' => 4,
          'Type' => 'zip',
          'TrailerLength' => 0,
          'CompressedLength' => 0,
          'Name' => 'META-INF/',
          'Header' => 'PK
     +N¦6                 META-INF/¦-  '
        };

Some of that looks sort of useful. However, all of my entries return 'Name' => 'META-INF/', so it doesn't look like a file name.

Is it possible to use IO::Uncompress::AnyUncompress (or even IO::Uncompress::Unzip) to read through the archive and see what files it contains? And, if so, how do I parse that header?

Otherwise, I'll have to go with Archive::Zip and let people know they have to download and install it from CPAN on their systems.

Upvotes: 4

Views: 4683

Answers (1)

stevenl

Reputation: 6798

Each file in the archive is compressed as its own data stream, so you need to iterate through the streams with nextStream to get the individual files. The name of the current member is in the Name field of the header.

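use strict;
use warnings;
use IO::Uncompress::Unzip qw(unzip $UnzipError);

my $zipfile = 'zipfile.zip';
my $u = IO::Uncompress::Unzip->new($zipfile)
    or die "Cannot open $zipfile: $UnzipError";

die "Zipfile has no members"
    if ! defined $u->getHeaderInfo;

# nextStream returns 1 while there is another member, 0 at the end
# of the archive, and -1 on error.
my $status;
for ($status = 1; $status > 0; $status = $u->nextStream) {
    # Called in scalar context to get the current member's header.
    my $name = $u->getHeaderInfo->{Name};
    warn "Processing member $name\n";

    if ($name =~ /\/$/) {
        # Names ending in "/" are directory entries.
        mkdir $name;
    }
    else {
        # Extract just this member into a file of the same name.
        unzip $zipfile => $name, Name => $name
            or die "unzip failed: $UnzipError\n";
    }
}

die "Error processing $zipfile: $UnzipError\n"
    if $status < 0;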
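
If you only need to see what is inside the jar and not extract anything, the same loop can just collect the names. The following is a minimal sketch under that assumption, not a tested tool. The helper names list_zip_members and class_names are made up for this example and are not part of IO::Uncompress::Unzip. It also assumes that a path such as org/testng/Foo.class maps to the class org.testng.Foo, which is the usual jar layout; inner classes (names containing $) are left as they are.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use IO::Uncompress::Unzip qw($UnzipError);

# Hypothetical helper: return the member names stored in a zip/jar
# archive, in archive order, without writing anything to disk.
sub list_zip_members {
    my ($zipfile) = @_;

    my $u = IO::Uncompress::Unzip->new($zipfile)
        or die "Cannot open $zipfile: $UnzipError\n";

    my @names;
    my $status;
    for ($status = 1; $status > 0; $status = $u->nextStream) {
        # Scalar context: header of the member we are positioned on.
        my $hdr = $u->getHeaderInfo;
        last unless defined $hdr;
        push @names, $hdr->{Name};
    }
    die "Error reading $zipfile: $UnzipError\n" if $status < 0;

    return @names;
}

# Hypothetical helper: keep only *.class entries and turn their paths
# into dotted names, e.g. "org/demo/Foo.class" becomes "org.demo.Foo".
sub class_names {
    my @names = @_;
    return map {
        (my $class = $_) =~ s{\.class$}{};
        $class =~ tr{/}{.};
        $class;
    } grep { /\.class$/ } @names;
}

# When run as a script with a jar path, print one class name per line.
if (!caller() && @ARGV) {
    print "$_\n" for class_names(list_zip_members($ARGV[0]));
}
```

Saved as, say, list_jar.pl (the file name is arbitrary), it could be run as perl list_jar.pl testng.jar and the output compared against the import statements in the Java sources. Note that nextStream has to read through each member to find the next one, so on large jars this is slower than a module that reads the zip central directory.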

Upvotes: 4
