Reputation: 917
I have a Java program that looks for a folder named after yesterday's date, compresses it into a 7z archive, and then deletes the folder. I have noticed that the archives my program generates are far too big. When I compress the same files (873 KB in total) with a program like 7-Zip File Manager, I get an archive of 5 KB, while my program produces an archive of 737 KB. I am afraid that my program is not really writing a 7z archive but an ordinary zip file. Is there a way to change my code so that it generates a 7z file as small as the one 7-Zip File Manager produces?
package SevenZip;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.text.SimpleDateFormat;
import java.util.Calendar;
import java.util.concurrent.TimeUnit;
import org.apache.commons.compress.archivers.sevenz.SevenZArchiveEntry;
import org.apache.commons.compress.archivers.sevenz.SevenZOutputFile;
public class SevenZipUtils {
public static void main(String[] args) throws InterruptedException, IOException {
String sourceFolder = "C:/Users/Ferid/Documents/Dates/";
String outputZipFile = "/Users/Ferid/Documents/Dates";
int sleepTime = 0;
compress(sleepTime, outputZipFile, sourceFolder);
}
public static boolean deleteDirectory(File directory, int sleepTime) throws InterruptedException {
if (directory.exists()) {
File[] files = directory.listFiles();
if (null != files) {
for (int i = 0; i < files.length; i++) {
if (files[i].isDirectory()) {
deleteDirectory(files[i], sleepTime);
System.out.println("Folder deleted: " + files[i]);
} else {
files[i].delete();
System.out.println("File deleted: " + files[i]);
}
}
}
}
TimeUnit.SECONDS.sleep(sleepTime);
return (directory.delete());
}
public static void compress(int sleepTime, String outputZipFile, String sourceFolder)
throws IOException, InterruptedException {
// finds folder of yesterdays date
final Calendar cal = Calendar.getInstance();
cal.add(Calendar.DATE, -1); // date of yesterday
String timeStamp = new SimpleDateFormat("yyyyMMdd").format(cal.getTime()); // format the date
System.out.println("Yesterday was " + timeStamp);
if (sourceFolder.endsWith("/")) { // add yesterday folder to sourcefolder path
sourceFolder = sourceFolder + timeStamp;
} else {
sourceFolder = sourceFolder + "/" + timeStamp;
}
if (outputZipFile.endsWith("/")) { // add yesterday folder name to outputZipFile path
outputZipFile = outputZipFile + timeStamp + ".7z"; // no extra space before the file name
} else {
outputZipFile = outputZipFile + "/" + timeStamp + ".7z";
}
File file = new File(sourceFolder);
if (file.exists()) {
try (SevenZOutputFile out = new SevenZOutputFile(new File(outputZipFile))) {
addToArchiveCompression(out, file, ".");
System.out.println("Files successfully compressed");
deleteDirectory(new File(sourceFolder), sleepTime);
}
} else {
System.out.println("Folder does not exist");
}
}
private static void addToArchiveCompression(SevenZOutputFile out, File file, String dir) throws IOException {
String name = dir + File.separator + file.getName();
if (file.isFile()) {
SevenZArchiveEntry entry = out.createArchiveEntry(file, name);
out.putArchiveEntry(entry);
// try-with-resources closes the input stream even if a read or write fails
try (FileInputStream in = new FileInputStream(file)) {
byte[] b = new byte[1024];
int count;
while ((count = in.read(b)) > 0) {
out.write(b, 0, count);
}
}
out.closeArchiveEntry();
System.out.println("File added: " + file.getName());
} else if (file.isDirectory()) {
File[] children = file.listFiles();
if (children != null) {
for (File child : children) {
addToArchiveCompression(out, child, name);
}
}
System.out.println("Directory added: " + file.getName());
} else {
System.out.println(file.getName() + " is not supported");
}
}
}
I am using the Apache Commons Compress library.
EDIT: Here is a link to where I got some of the Apache Commons Compress code from.
Upvotes: 8
Views: 3596
Reputation: 3117
I don't have enough rep to comment anymore so here are my thoughts:
SevenZOutputFile uses no (or very low) compression. As @CristiFati said, the difference in compression is odd, especially for text files.
That said, if zip really isn't an option, your last resort could be to call the proper command line directly within your program.
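Calling the command-line tool from Java could look roughly like the sketch below. It assumes a 7-Zip executable is installed and reachable under the name you pass in (for example "7z" on the PATH); the class and method names are made up for illustration. The `a` command adds files to an archive and `-t7z` selects the 7z format.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Commons Compress or 7-Zip.
final class SevenZipCli {

    // Builds the argument list for: <exe> a -t7z <archive> <folder>
    static List<String> buildCommand(String exe, Path archive, Path folder) {
        return Arrays.asList(exe, "a", "-t7z", archive.toString(), folder.toString());
    }

    // Runs the external tool and returns its exit code (0 means success for 7-Zip).
    // Assumption: `exe` names an installed 7-Zip command-line binary.
    static int compress(String exe, Path archive, Path folder)
            throws IOException, InterruptedException {
        Process process = new ProcessBuilder(buildCommand(exe, archive, folder))
                .inheritIO() // show the tool's output in the Java console
                .start();
        return process.waitFor();
    }
}
```

Only delete the source folder after checking that the exit code is 0; otherwise a failed run would lose the data.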
If pure 7z is not mandatory, another option would be to use a "tgz"-like format to emulate solid compression: first pack all files into a single uncompressed file (e.g. tar format, or a zip file with no compression), then compress that single file with the standard Java Deflate algorithm. Of course, that is viable only if the resulting format is recognized by the processes that consume it later.
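A minimal sketch of that two-layer idea using only the JDK: an inner zip written at compression level 0, streamed straight through gzip (Deflate) so that all files are compressed as one stream. The class name and the choice of gzip as the outer layer are assumptions for illustration. Note that Deflate only looks back 32 KB, so redundancy between large files will not be exploited the way LZMA would.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;
import java.util.zip.Deflater;
import java.util.zip.GZIPOutputStream;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

// Hypothetical helper: uncompressed zip container inside a single gzip stream.
final class SolidLikeArchive {

    static void pack(Path sourceDir, Path target) throws IOException {
        // Collect the files first so the target is never picked up by the walk.
        List<Path> files;
        try (Stream<Path> walk = Files.walk(sourceDir)) {
            files = walk.filter(Files::isRegularFile).sorted().collect(Collectors.toList());
        }
        try (OutputStream raw = Files.newOutputStream(target);
             GZIPOutputStream gzip = new GZIPOutputStream(raw);
             ZipOutputStream zip = new ZipOutputStream(gzip)) {
            // Inner container does no compression; gzip compresses everything at once.
            zip.setLevel(Deflater.NO_COMPRESSION);
            for (Path file : files) {
                String name = sourceDir.relativize(file).toString().replace('\\', '/');
                zip.putNextEntry(new ZipEntry(name));
                Files.copy(file, zip);
                zip.closeEntry();
            }
        }
    }
}
```

The result can be read back by wrapping a GZIPInputStream in a ZipInputStream, or by any tool that understands a gzip-compressed zip.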
Upvotes: 5
Reputation:
Use the 7-Zip file archiver instead; it easily compresses an 832 KB file to 26.0 KB:
.java related files.
Run arguments to project properties: e "D:\\2017ASP.pdf" "D:\\2017ASP.7z", where e stands for encode, followed by "input path" "output path".
Results
Case 1 (.pdf file): from 33,969 KB to 24,645 KB.
Case 2 (.docx file): from 832 KB to 26.0 KB.
Upvotes: 5
Reputation: 10931
Commons Compress is starting a new block in the container file for each archive entry. Note the block counter here:
Not quite the answer you were hoping for, but the docs say it doesn't support "solid compression" - writing several files to a single block. See paragraph 5 in the docs here.
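To see why one block per entry hurts so much with many small, similar files, here is a small self-contained comparison. It uses the JDK's Deflate as a stand-in (not LZMA and not Commons Compress), and the sample "files" are invented log lines, so the numbers only illustrate the effect.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.DeflaterOutputStream;

// Illustration only: per-file compression versus one shared ("solid") stream.
final class SolidVsPerFile {

    // Returns the Deflate-compressed size of the given bytes.
    static int deflatedSize(byte[] data) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (DeflaterOutputStream out = new DeflaterOutputStream(sink)) {
            out.write(data);
        }
        return sink.size();
    }

    // Returns { total size when each file is compressed alone, size when compressed together }.
    static int[] compare(int fileCount) throws IOException {
        ByteArrayOutputStream all = new ByteArrayOutputStream();
        int perFileTotal = 0;
        for (int i = 0; i < fileCount; i++) {
            byte[] file = ("id=" + i + ";host=example;status=OK;message=nothing to report\n")
                    .getBytes(StandardCharsets.UTF_8);
            perFileTotal += deflatedSize(file); // fresh compressor: nothing learned from other files
            all.write(file);
        }
        return new int[] { perFileTotal, deflatedSize(all.toByteArray()) };
    }

    public static void main(String[] args) throws IOException {
        int[] sizes = compare(200);
        System.out.println("per file: " + sizes[0] + " bytes, solid: " + sizes[1] + " bytes");
    }
}
```

Each per-file stream starts with an empty dictionary, so repetition across files is never exploited; the single stream can reuse what it has already seen.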
A quick look around found a few other Java libraries that support LZMA compression, but I couldn't spot one that could do so within the parent container file format for 7-Zip. Perhaps someone else knows of an alternative...
It sounds like a normal zip file format (e.g. via ZipOutputStream) is not an option?
Upvotes: 8