Reputation: 511
Delete files from a ZIP archive without decompressing using Java (Preferred) or Python
Hi,
I work with large ZIP files containing many hundreds of highly compressed text files. Decompressing one of these ZIP files can take a while and easily consume up to 20 GB of disk space. I would like to remove certain files from these ZIP files without having to decompress everything and then recompress only the files I want to keep.
Of course, it is possible to do this the long way, but that is very inefficient.
I would prefer to do this in Java, but will consider Python.
Upvotes: 10
Views: 25253
Reputation: 17441
Yes, this is possible in Java using a library called TrueZIP.
TrueZIP is a Java-based virtual file system (VFS) which enables client applications to perform CRUD (Create, Read, Update, Delete) operations on archive files as if they were virtual directories, even with nested archive files in multithreaded environments.
See the link below for more information: https://christian-schlichtherle.bitbucket.io/truezip/
Upvotes: 1
Reputation: 1763
A clean solution using only the standard library. I'm not sure whether it is included in the Android SDK; that remains to be checked.
import java.net.URI;
import java.nio.file.*;
import java.util.*;

public class ZPFSDelete {
    public static void main(String[] args) throws Exception {
        /* Define ZIP file system properties in a HashMap */
        Map<String, String> zip_properties = new HashMap<>();
        /* We want to read an existing ZIP file, so we set this to false */
        zip_properties.put("create", "false");
        /* Specify the path to the ZIP file that you want to read as a file system */
        URI zip_disk = URI.create("jar:file:/my_zip_file.zip");
        /* Create the ZIP file system; closing it writes the change back to the archive */
        try (FileSystem zipfs = FileSystems.newFileSystem(zip_disk, zip_properties)) {
            /* Get the path inside the ZIP file of the entry to delete */
            Path pathInZipfile = zipfs.getPath("source.sql");
            System.out.println("About to delete an entry from ZIP file: " + pathInZipfile.toUri());
            /* Execute the delete */
            Files.delete(pathInZipfile);
            System.out.println("File successfully deleted");
        }
    }
}
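The same approach extends to removing several entries in one pass, which matters because the archive is only written back when the file system is closed. Below is a minimal sketch under a few assumptions: the class name `ZipFsBulkDelete` and the method `deleteEntries` are made up for illustration, and the URI is built from a `Path` rather than hard-coded. As far as I know the zip file system provider copies untouched entries without recompressing them when it rewrites the archive, but that is worth verifying on your JDK before relying on it for very large files.

```java
import java.io.IOException;
import java.net.URI;
import java.nio.file.FileSystem;
import java.nio.file.FileSystems;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;

public class ZipFsBulkDelete {
    /**
     * Deletes the named entries from an existing ZIP file.
     * Returns how many of the names were actually present and removed.
     * (Hypothetical helper; not part of any library.)
     */
    public static int deleteEntries(Path zipFile, Collection<String> entryNames) throws IOException {
        // "jar:" + file URI is the form the zip file system provider expects
        URI uri = URI.create("jar:" + zipFile.toUri());
        int deleted = 0;
        // "create" = "false": fail instead of creating a new archive if the file is missing
        try (FileSystem zipfs = FileSystems.newFileSystem(uri, Collections.singletonMap("create", "false"))) {
            for (String name : entryNames) {
                // deleteIfExists avoids an exception for names that are not in the archive
                if (Files.deleteIfExists(zipfs.getPath(name))) {
                    deleted++;
                }
            }
        } // closing the file system flushes all deletions to the archive at once
        return deleted;
    }

    public static void main(String[] args) throws IOException {
        // Usage: java ZipFsBulkDelete archive.zip entry1 entry2 ...
        int n = deleteEntries(Paths.get(args[0]), Arrays.asList(args).subList(1, args.length));
        System.out.println("Deleted " + n + " entries");
    }
}
```

Batching all deletions inside one `try` block means the archive is rewritten once rather than once per entry.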
Upvotes: 6
Reputation: 86768
I don't have code to do this, but the basic idea is simple and should translate into almost any language the same way. The ZIP file layout is just a series of blocks that represent files (a header followed by the compressed data), finished off with a central directory that just contains all the metadata. Here's the process:
See http://en.wikipedia.org/wiki/ZIP_%28file_format%29 for all the details on the ZIP file structures.
As bestsss suggests, you might want to perform the copying into another file, so as to prevent losing data in the event of a failure.
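To make the block-copy idea concrete, here is a rough Java sketch that writes the result to a second file, as suggested above. It is not a full implementation: it assumes a plain single-disk archive without ZIP64 extensions, a central directory smaller than 2 GiB, entry names that decode as UTF-8, and blocks that sit back to back so that each block ends where the next one (or the central directory) begins. The class name `RawZipDelete` and its method names are invented for this example. The field offsets follow the ZIP format description linked above.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class RawZipDelete {
    /** One central-directory record plus the byte range of its block in the source file. */
    private static final class Rec {
        byte[] raw;      // the central-directory record, copied verbatim
        String name;     // entry name (decoded as UTF-8: an assumption of this sketch)
        long start, end; // [start, end) = local header + compressed data (+ data descriptor)
    }

    private static ByteBuffer read(FileChannel ch, long pos, int len) throws IOException {
        ByteBuffer b = ByteBuffer.allocate(len).order(ByteOrder.LITTLE_ENDIAN);
        while (b.hasRemaining() && ch.read(b, pos + b.position()) >= 0) { /* keep reading */ }
        b.flip();
        return b;
    }

    public static void deleteEntries(Path src, Path dst, Set<String> doomed) throws IOException {
        try (FileChannel in = FileChannel.open(src, StandardOpenOption.READ);
             FileChannel out = FileChannel.open(dst, StandardOpenOption.CREATE,
                     StandardOpenOption.TRUNCATE_EXISTING, StandardOpenOption.WRITE)) {
            // 1. Find the end-of-central-directory (EOCD) record by scanning backwards.
            int tailLen = (int) Math.min(in.size(), 22 + 65535);
            ByteBuffer tail = read(in, in.size() - tailLen, tailLen);
            int eocd = tailLen - 22;
            while (eocd >= 0 && tail.getInt(eocd) != 0x06054b50) eocd--;
            if (eocd < 0) throw new IOException("end-of-central-directory record not found");
            int count = tail.getShort(eocd + 10) & 0xffff;
            int cdSize = tail.getInt(eocd + 12);
            long cdStart = tail.getInt(eocd + 16) & 0xffffffffL;
            // 2. Parse the central directory into records.
            ByteBuffer cd = read(in, cdStart, cdSize);
            List<Rec> recs = new ArrayList<>();
            for (int i = 0, p = 0; i < count; i++) {
                if (cd.getInt(p) != 0x02014b50) throw new IOException("bad central directory");
                int nameLen = cd.getShort(p + 28) & 0xffff;
                Rec r = new Rec();
                r.raw = new byte[46 + nameLen + (cd.getShort(p + 30) & 0xffff) + (cd.getShort(p + 32) & 0xffff)];
                cd.position(p);
                cd.get(r.raw);
                r.name = new String(r.raw, 46, nameLen, StandardCharsets.UTF_8);
                r.start = cd.getInt(p + 42) & 0xffffffffL;
                recs.add(r);
                p += r.raw.length;
            }
            // 3. Each block ends where the next one (or the central directory) starts.
            recs.sort(Comparator.comparingLong(r -> r.start));
            for (int i = 0; i < recs.size(); i++) {
                recs.get(i).end = i + 1 < recs.size() ? recs.get(i + 1).start : cdStart;
            }
            // 4. Copy the kept blocks byte for byte and patch each record's block offset.
            ByteBuffer newCd = ByteBuffer.allocate(cdSize);
            int kept = 0;
            for (Rec r : recs) {
                if (doomed.contains(r.name)) continue;
                long newStart = out.position();
                for (long done = 0, len = r.end - r.start; done < len; ) {
                    long n = in.transferTo(r.start + done, len - done, out);
                    if (n <= 0) throw new IOException("unexpected end of file");
                    done += n;
                }
                ByteBuffer.wrap(r.raw).order(ByteOrder.LITTLE_ENDIAN).putInt(42, (int) newStart);
                newCd.put(r.raw);
                kept++;
            }
            // 5. Write the trimmed central directory and a patched EOCD (comment preserved).
            long newCdStart = out.position();
            newCd.flip();
            int newCdSize = newCd.remaining();
            while (newCd.hasRemaining()) out.write(newCd);
            byte[] end = new byte[tailLen - eocd];
            tail.position(eocd);
            tail.get(end);
            ByteBuffer e = ByteBuffer.wrap(end).order(ByteOrder.LITTLE_ENDIAN);
            e.putShort(8, (short) kept).putShort(10, (short) kept);
            e.putInt(12, newCdSize).putInt(16, (int) newCdStart);
            while (e.hasRemaining()) out.write(e);
        }
    }

    public static void main(String[] args) throws IOException {
        // Usage: java RawZipDelete in.zip out.zip entry1 entry2 ...
        Set<String> doomed = new HashSet<>(Arrays.asList(args).subList(2, args.length));
        deleteEntries(Paths.get(args[0]), Paths.get(args[1]), doomed);
    }
}
```

Because the compressed bytes are moved with `transferTo` and never pass through an inflater or deflater, the cost is roughly that of a file copy. Once the output has been checked, it can be renamed over the original.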
Upvotes: 2
Reputation: 511
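OK, I think I found a potential solution at www.javaer.org. It definitely deletes files inside the ZIP, and nothing is extracted to disk. Note, however, that ZipInputStream still inflates each kept entry and ZipOutputStream deflates it again in memory, so this is a streaming recompression rather than a raw copy. Here is the code: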
import java.io.*;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import java.util.zip.*;

public static void deleteZipEntry(File zipFile, String[] files) throws IOException {
    // Reserve a temp-file name, then delete the file itself;
    // otherwise you cannot rename your existing ZIP to it.
    File tempFile = File.createTempFile(zipFile.getName(), null);
    tempFile.delete();
    tempFile.deleteOnExit();
    boolean renameOk = zipFile.renameTo(tempFile);
    if (!renameOk) {
        throw new RuntimeException("could not rename the file " + zipFile.getAbsolutePath()
                + " to " + tempFile.getAbsolutePath());
    }
    Set<String> toDelete = new HashSet<>(Arrays.asList(files));
    byte[] buf = new byte[8192];
    // try-with-resources closes both streams even if copying fails part-way
    try (ZipInputStream zin = new ZipInputStream(new FileInputStream(tempFile));
         ZipOutputStream zout = new ZipOutputStream(new FileOutputStream(zipFile))) {
        ZipEntry entry;
        while ((entry = zin.getNextEntry()) != null) {
            String name = entry.getName();
            if (toDelete.contains(name)) {
                continue; // skip entries that should be deleted
            }
            // Add the ZIP entry to the output stream.
            zout.putNextEntry(new ZipEntry(name));
            // Transfer the entry's bytes: read() inflates them, write() deflates them again.
            int len;
            while ((len = zin.read(buf)) > 0) {
                zout.write(buf, 0, len);
            }
            zout.closeEntry();
        }
    } // closing zout completes the ZIP file
    tempFile.delete();
}
Upvotes: 0