David M. Karr

Reputation: 15225

Use Archive::Zip to determine if a member is a text file or not

I'm working on a script that searches the contents of zip archive members for a given search string, limited to members whose names match a pattern.

I have the following sub that processes a single archive (the script can take more than one archive on the command line):

sub processArchive($$$$) {
    my ($zip, $searchstr, $match, $zipName) = @_;
    print "zip[$zip] searchstr[$searchstr] match[$match] zipName[$zipName]\n";
    my @matchingList = $zip->membersMatching($match);
    if (@matchingList) {
        print $zipName . ":\n";
        for my $member (@matchingList) {
            # contents() in scalar context returns the whole member as one string.
            my $contents = $member->contents();
            print "member[$member]\n";
            print "textfile[" . $member->isTextFile() . "] contents[$contents]\n";
            if ($member->isTextFile()) {
                print "Is a text file.\n";
            }
            else {
                print "Is not a text file.\n";
            }
            # Split into lines (keeping newlines) so grep matches line by line.
            my @matchingLines = grep { /$searchstr/ } split /^/, $contents;
            print @matchingLines if @matchingLines;
        }
    }
}

The logic isn't even complete yet. I'm first experimenting with calling "isTextFile()" to see what it does. I must be doing something wrong, because I get "Is not a text file" for at least one member that is clearly a text file.

I also note that when I print the value of the return from "isTextFile()", it's always an empty string. Is that what I should expect from printing a "true" or "false" value, or is something else wrong here?

Upvotes: 0

Views: 73

Answers (1)

user149341

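The "text file" status is not determined from the member's contents. It is read from a flag stored in the ZIP file's metadata for each member (the internal file attributes). Many archiving tools do not set this flag properly, as it is rarely used and has no impact on normal use.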
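On the empty string in the printout: Perl's comparison operators return a special false value that stringifies to an empty string, so printing a false result shows nothing. The sketch below only demonstrates that general Perl behavior. That isTextFile() returns this kind of value is an assumption, though it is consistent with the output described in the question.

```perl
use strict;
use warnings;

# Results of ordinary comparisons, standing in for a boolean return value.
my $false = (1 == 2);
my $true  = (1 == 1);

print "false[" . $false . "]\n";   # prints: false[]
print "true["  . $true  . "]\n";   # prints: true[1]
```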

If you actually need to check whether a file contains text, you will need to extract it and see for yourself.
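One way to "see for yourself" is a rough heuristic on the extracted bytes. The sketch below is only an illustration, not anything provided by Archive::Zip: looks_like_text() is a hypothetical helper, and the 4096-byte sample size and 5% threshold are arbitrary assumptions. It treats data as binary if the sample contains a NUL byte or too many control characters.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Archive::Zip): guess whether a string of
# bytes is text by inspecting the first $sample_size bytes.
sub looks_like_text {
    my ($data, $sample_size) = @_;
    $sample_size //= 4096;              # assumed default sample size

    return 0 unless defined $data;      # nothing extracted
    return 1 if $data eq '';            # treat an empty member as text

    my $sample = substr($data, 0, $sample_size);

    # A NUL byte almost always means binary data.
    return 0 if $sample =~ /\0/;

    # Count control characters other than tab, LF, FF and CR.
    my $odd = () = $sample =~ /[\x01-\x08\x0B\x0E-\x1F\x7F]/g;

    # Assumed threshold: fewer than 5% odd bytes counts as text.
    return ($odd / length($sample)) < 0.05 ? 1 : 0;
}
```

In the loop from the question, this could be called as looks_like_text(scalar $member->contents()) in place of $member->isTextFile(). Being a heuristic, it can misclassify some files, for example UTF-16 text, which contains NUL bytes.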

Upvotes: 2
