user3475234

Reputation: 1573

How to extract a variety of zip files without knowing the extension in bash

I'm writing a bash script that needs to handle a bunch of archive files in different formats (primarily tar.gz, zip, and rar). Is there a tool that does this, so that I could just call "toolname filename"? If not, how do I determine the extension of a file, so that I can write a case statement that dispatches to the different required tools?

Upvotes: 1

Views: 2761

Answers (3)

John1024

Reputation: 113834

Yes, you can write a complex shell script to handle this, but you don't need to. The right tool is 7z. It natively handles all the compression formats that you mention and many more.

For example, allfiles- is a zip archive (note that the extension is missing). To list its contents, use the l (ell) function:

$ 7z l allfiles-

7-Zip [64] 9.20  Copyright (c) 1999-2010 Igor Pavlov  2010-11-18
p7zip Version 9.20 (locale=en_US.UTF-8,Utf16=on,HugeFiles=on,2 CPUs)

Listing archive: allfiles-

--
Path = allfiles-
Type = zip
Physical Size = 367

   Date      Time    Attr         Size   Compressed  Name
------------------- ----- ------------ ------------  ------------------------
2015-03-10 21:05:40 .....           29           29  file1
2015-03-10 21:05:42 .....           29           29  file2
2015-03-10 21:05:44 .....           29           29  file3
------------------- ----- ------------ ------------  ------------------------
                                    87           87  3 files, 0 folders

Note that 7z does not depend on having the right extension. It figured out the type of archive itself.

The functions supported by 7z are:

   a      Add
   d      Delete
   e      Extract
   l      List
   t      Test
   u      Update/Create
   x      eXtract with full paths

Among the file formats supported by 7z are: LZMA2, XZ, ZIP, Zip64, CAB, RAR, ARJ, GZIP, BZIP2, TAR, CPIO, RPM, ISO, as well as most filesystem images and DEB formats.
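For the extraction case asked about in the question, a minimal sketch using the x function might look like the following. The wrapper name extract_any and its optional destination argument are my own assumptions for illustration; only 7z x and the -o switch come from 7z itself.

```shell
#!/usr/bin/env bash
# extract_any ARCHIVE [DEST]
# Hypothetical wrapper around "7z x"; the name and arguments are illustrative.
extract_any() {
    local archive=$1 dest=${2:-.}

    # Fail early if 7z is not installed.
    if ! command -v 7z >/dev/null 2>&1; then
        echo "7z not found" >&2
        return 127
    fi

    # Fail early if the archive is not a regular file.
    if [[ ! -f $archive ]]; then
        echo "no such file: $archive" >&2
        return 1
    fi

    # x extracts with full paths; -o sets the output directory
    # (there is no space between -o and the directory name).
    7z x -o"$dest" "$archive"
}
```

With this, extract_any allfiles- out should unpack the example archive into out/. One caveat: 7z treats a tar.gz as a gzip stream that contains a tar file, so a single x pass generally leaves you with the .tar, and a second pass is needed to unpack it.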

To install 7z on a Debian-like system, run:

apt-get install p7zip-full

Upvotes: 3

rici

Reputation: 241701

One possibly more reliable way of recognizing a file's type is the file tool, which uses a database of identifying patterns to recognize a file's format. Some useful options:

  • file --mime-type prints only a mimetype (such as application/zip or application/x-gzip) which is easy to parse (or match)
  • file -i prints the mimetype and other parameters such as charset (not relevant for compressed files)
  • file -z also attempts to decompress the file (doesn't work with all archive formats), which is the best way to distinguish simple gzipped files from gzipped tar archives.
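Since the question mentions a case statement, here is a rough sketch of dispatching on the output of file --mime-type. The function names are hypothetical, and the MIME strings are the ones I would expect from common versions of file; they can differ between versions (for example application/gzip versus application/x-gzip), so check what your system actually prints.

```shell
#!/usr/bin/env bash
# Map a MIME type string to the name of a tool that can unpack it.
# The set of types handled here is illustrative, not exhaustive.
tool_for_mime() {
    case $1 in
        application/zip)                       echo unzip ;;
        application/gzip|application/x-gzip)   echo gzip ;;
        application/x-tar)                     echo tar ;;
        application/x-rar|application/vnd.rar) echo unrar ;;
        *)                                     echo unknown ;;
    esac
}

# Hypothetical convenience wrapper: ask file(1) for the MIME type
# (-b suppresses the leading file name), then map it.
tool_for_file() {
    tool_for_mime "$(file -b --mime-type -- "$1")"
}
```

Keeping the mapping separate from the call to file makes it easy to check the case patterns without needing sample archives on hand. A gzipped tarball will be reported as gzip here; that is where file -z helps.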

You almost certainly have it installed already; if not, it is available from the file homepage.

Upvotes: 2

Tim Pierce

Reputation: 5664

The bash parameter expansion operator ## is often used to extract part of a filename this way. If $filename contains the name of a file, then the expression ${filename##*.} is the filename after removing the longest leading string that matches the pattern *., i.e. everything up to and including the last dot, which leaves the filename extension.

$ filename=foo.tgz
$ echo ${filename##*.}
tgz

That might not be the best option for your situation if some of the files have a compound extension like tar.gz, since ${filename##*.} would return only gz. When the pattern you're looking for is more variable, you will probably want to test the filename against a series of glob patterns instead:

if [[ $filename = *.tar.gz ]]; then
    tar xzf "$filename"
elif [[ $filename = *.zip ]]; then
    unzip "$filename"
elif [[ $filename = *.rar ]]; then
    unrar x "$filename"
fi
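The same dispatch can be written as the case statement the question asks about. This is only a sketch: the function name extract_command_for is made up for illustration, and it prints the command instead of running it so that the mapping is easy to check. I have also added *.tgz as an assumed alias for *.tar.gz.

```shell
#!/usr/bin/env bash
# Print the extraction command that matches a file name's suffix.
# Returns non-zero when no pattern matches.
extract_command_for() {
    local filename=$1
    case $filename in
        *.tar.gz|*.tgz) echo "tar xzf" ;;
        *.zip)          echo "unzip" ;;
        *.rar)          echo "unrar x" ;;
        *)              return 1 ;;
    esac
}
```

Patterns in a case statement are tried in order, so a compound suffix like *.tar.gz should be listed before any shorter pattern (such as a bare *.gz) that would also match it.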

Upvotes: 1
