Kapil D

Reputation: 2660

Decompress .gz files in batch

I have about 100 .gz files that I need to decompress, and I have a couple of questions.

a) I am using the code given at http://www.roseindia.net/java/beginners/JavaUncompress.shtml to decompress the .gz files, and it works fine. Question: is there a way to get the name of the file stored inside the .gz? I know Java's Zip classes provide an enumeration of entries, which gives the file name, size, etc. stored in a .zip file. Is there an equivalent for .gz files, or is the file name always filename.gz with the .gz removed?

b) Is there a more elegant way to decompress a .gz file by calling an external utility from Java code, for example invoking the 7-Zip application from a Java class? Then I wouldn't have to worry about input/output streams.

Thanks in advance. Kapil

Upvotes: 6

Views: 15244

Answers (6)

fredarin

Reputation: 784

a) Zip is an archive format, while gzip is not. So an entry iterator does not make much sense unless (for example) your gz-files are compressed tar files. What you want is probably:

File outFile = new File(infile.getParent(), infile.getName().replaceAll("\\.gz$", ""));

b) Do you only want to uncompress the files? If not, you may be fine using GZIPInputStream to read the files directly, i.e. without an intermediate decompression step.
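To illustrate reading a .gz file directly, here is a minimal sketch that counts the lines of a gzipped text file without writing a decompressed copy to disk. The class and method names (GzipLineCounter, countLines) are made up for this example, and UTF-8 text content is an assumption.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.zip.GZIPInputStream;

// Hypothetical helper: processes a gzipped text file as a stream.
final class GzipLineCounter {
    static long countLines(Path gzFile) throws IOException {
        // GZIPInputStream decompresses on the fly while the reader consumes it.
        try (BufferedReader reader = new BufferedReader(new InputStreamReader(
                new GZIPInputStream(Files.newInputStream(gzFile)),
                StandardCharsets.UTF_8))) {   // assumption: the content is UTF-8 text
            long lines = 0;
            while (reader.readLine() != null) {
                lines++;
            }
            return lines;
        }
    }
}
```

Any other per-line processing could take the place of the counter inside the loop.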

But ok. Let's say you really only want to uncompress the files. If so, you could probably use this:

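import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.zip.GZIPInputStream;

public static File unGzip(File infile, boolean deleteGzipfileOnSuccess) throws IOException {
    // The output goes next to the input, with the trailing ".gz" stripped.
    File outFile = new File(infile.getParent(), infile.getName().replaceAll("\\.gz$", ""));
    // try-with-resources closes every stream exactly once, even if an exception is thrown.
    try (InputStream fin = new FileInputStream(infile);
         InputStream gin = new GZIPInputStream(fin);
         OutputStream fos = new FileOutputStream(outFile)) {
        byte[] buf = new byte[100000];
        int len;
        // read() returns -1 at end of stream.
        while ((len = gin.read(buf)) != -1) {
            fos.write(buf, 0, len);
        }
    }
    // Delete only after the streams are closed and the copy has succeeded.
    if (deleteGzipfileOnSuccess) {
        infile.delete();
    }
    return outFile;
}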
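
Since the question is about roughly 100 files, the same idea can be applied to a whole directory. This is a minimal sketch, assuming all archives sit in one directory and each should be written next to its source with the .gz suffix removed. The names BatchGunzip and gunzipAll are illustrative and not part of the answer above.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.GZIPInputStream;

// Hypothetical helper: decompresses every *.gz file found directly in a directory.
final class BatchGunzip {
    static List<Path> gunzipAll(Path dir) throws IOException {
        // Collect the inputs first so that files written below are not re-listed.
        List<Path> inputs = new ArrayList<>();
        try (DirectoryStream<Path> gzFiles = Files.newDirectoryStream(dir, "*.gz")) {
            for (Path gz : gzFiles) {
                inputs.add(gz);
            }
        }
        List<Path> written = new ArrayList<>();
        for (Path gz : inputs) {
            String name = gz.getFileName().toString();
            if (name.length() <= ".gz".length()) {
                continue; // a file literally named ".gz" has no base name to write to
            }
            Path target = dir.resolve(name.substring(0, name.length() - ".gz".length()));
            try (InputStream in = new GZIPInputStream(Files.newInputStream(gz))) {
                // Files.copy drains the decompressed stream into the target file.
                Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
            }
            written.add(target);
        }
        return written;
    }
}
```

Note that REPLACE_EXISTING overwrites any file that already has the target name; drop that option if existing files should cause an error instead.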

Upvotes: 10

Garnet Ulrich

Reputation: 3009

.gz files (gzipped) can store the filename of a compressed file. So for example FuBar.doc can be saved inside myDocument.gz and with appropriate uncompression, the file can be restored to the filename FuBar.doc. Unfortunately, java.util.zip.GZIPInputStream does not support any way of reading the filename even if it is stored inside the archive.
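
If the stored name matters, the gzip header can be parsed by hand. The sketch below follows the header layout in RFC 1952 (magic bytes, method, flags, six bytes of timestamp/extra flags/OS, optional extra field, optional zero-terminated name). The class and method names (GzipHeaderName, readOriginalName) are invented for this example, and it only reads the header; it does not decompress anything.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: extracts the optional FNAME field from a gzip header.
final class GzipHeaderName {
    private static final int FEXTRA = 0x04; // flag bit: extra field present
    private static final int FNAME = 0x08;  // flag bit: original file name present

    /** Returns the file name stored in the header, or null if none is stored. */
    static String readOriginalName(InputStream raw) throws IOException {
        DataInputStream in = new DataInputStream(raw);
        if (in.readUnsignedByte() != 0x1f || in.readUnsignedByte() != 0x8b) {
            throw new IOException("Not a gzip stream");
        }
        in.readUnsignedByte();             // compression method (8 = deflate)
        int flags = in.readUnsignedByte(); // FLG byte
        in.readFully(new byte[6]);         // MTIME (4 bytes), XFL, OS
        if ((flags & FEXTRA) != 0) {
            // XLEN is a little-endian 16-bit length, followed by that many bytes.
            int xlen = in.readUnsignedByte() | (in.readUnsignedByte() << 8);
            in.readFully(new byte[xlen]);
        }
        if ((flags & FNAME) == 0) {
            return null;
        }
        // The name is ISO-8859-1 text terminated by a zero byte.
        ByteArrayOutputStream name = new ByteArrayOutputStream();
        int b;
        while ((b = in.readUnsignedByte()) != 0) {
            name.write(b);
        }
        return new String(name.toByteArray(), StandardCharsets.ISO_8859_1);
    }
}
```

Files written by java.util.zip.GZIPOutputStream carry no name field, so for those this method returns null and the "strip .gz" convention is the only option.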

Upvotes: 0

Peter Lawrey

Reputation: 533442

Have you tried this?

gunzip *.gz

Upvotes: 0

BobMcGee

Reputation: 20110

Gzip is normally used only on single files, so it generally does not contain information about individual files. To bundle multiple files into one compressed archive, they are first combined into an uncompressed tar file (with info about individual contents), which is then compressed as a single file. This combination is called a tarball.

There are libraries to extract the individual file info from a tar, just as with ZipEntries. One example. You will first have to extract the .gz file into a temporary file in order to use it, or at least feed the GZIPInputStream into the tar library.

You may also call 7-Zip from the command line using Java. 7-Zip command-line syntax is here: 7-Zip Command Line Syntax. Example of calling the command shell from Java: Executing shell commands in Java. You will have to call 7-Zip twice: once to extract the Tar from the .tar.gz or .tgz file, and again to extract the individual files from the Tar.
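
As a rough sketch of calling an external tool from Java, the example below uses ProcessBuilder. It assumes a gunzip executable is available on the PATH; for 7-Zip you would substitute its executable and arguments in the command list. The class and method names (ExternalDecompress, command, run) are illustrative only.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: delegates decompression to an external command-line tool.
final class ExternalDecompress {
    // Builds the command line. Assumption: "gunzip" is installed and on the PATH.
    static List<String> command(Path gzFile) {
        return Arrays.asList("gunzip", gzFile.toString());
    }

    // Runs the tool and returns its exit code (0 conventionally means success).
    static int run(Path gzFile) throws IOException, InterruptedException {
        Process process = new ProcessBuilder(command(gzFile))
                .inheritIO() // forward the tool's stdout/stderr to this JVM's console
                .start();
        return process.waitFor();
    }
}
```

Passing the arguments as a list avoids shell quoting problems with file names that contain spaces.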

Or, you could just do the easy thing and write a brief shell script or batch file to do your decompression. There's no reason to hammer a square peg into a round hole; this is what batch files are made for. As a bonus, you can also feed them parameters, which considerably reduces the complexity of the command line you execute from Java while still letting Java control execution.

Upvotes: 0

alamar

Reputation: 19313

If you have a fixed number of files to decompress once, why not use existing tools for that? As Paul Morie noted, gunzip can do it:

for i in *.gz; do gunzip "$i"; done

It names the output files automatically, stripping the .gz suffix.

On Windows, try WinRAR, or gunzip from http://unxutils.sf.net

Upvotes: 0

Paul Morie

Reputation: 15778

Regarding A, the gunzip command creates an uncompressed file with the original name minus the .gz suffix. See the man page.

Regarding B, do you need gzip specifically, or will another compression algorithm do? There's a Java port of the LZMA compression algorithm used by 7-Zip to create .7z files, but it will not handle .gz files.

Upvotes: 2
