Reputation: 161
I have a requirement to create a zip file from a list of available files. The files are of different types, such as txt, pdf, xml, etc. I am using the java.util classes to do it.
The zip file must not exceed a maximum size of 5 MB. I should select the files from the list based on timestamp and add them to the zip until its size reaches 5 MB, then skip the remaining files.
Is there a way in Java to estimate the zip file size in advance, without creating the actual file?
Or is there any other approach to handle this?
Upvotes: 16
Views: 14077
Reputation: 426
There is a better option. Create a dummy LengthOutputStream that just counts the written bytes:
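import java.io.IOException;
import java.io.OutputStream;

public class LengthOutputStream extends OutputStream {
    private long length = 0L;
    @Override
    public void write(int b) throws IOException {
        length++;
    }
    // Count bulk writes directly instead of going through write(int) byte by byte.
    @Override
    public void write(byte[] b, int off, int len) throws IOException {
        length += len;
    }
    public long getLength() {
        return length;
    }
}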
You can simply connect the LengthOutputStream to a ZipOutputStream:
public static long sizeOfZippedDirectory(File dir) throws FileNotFoundException, IOException {
    try (LengthOutputStream sos = new LengthOutputStream();
         ZipOutputStream zos = new ZipOutputStream(sos)) {
        ... // Add ZIP entries to the stream
        zos.finish(); // write the central directory so that it is included in the count
        return sos.getLength();
    }
}
The LengthOutputStream object counts the bytes of the zipped stream but stores nothing, so there is no file size limit. This method gives an accurate size estimate, but it is almost as slow as creating the ZIP file.
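As a self-contained illustration of the counting idea, here is a minimal sketch that measures the zipped size of in-memory entries. The class name ZipSizeProbe, the nested CountingSink, and the name-to-content map are assumptions made for the example; real code would stream the files from disk instead.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

public class ZipSizeProbe {

    /** Discards everything written to it and only counts the bytes (hypothetical helper). */
    static final class CountingSink extends OutputStream {
        private long bytes;

        @Override
        public void write(int b) {
            bytes++;
        }

        @Override
        public void write(byte[] buf, int off, int len) {
            bytes += len;
        }

        long count() {
            return bytes;
        }
    }

    /** Returns the size in bytes of a ZIP holding the given entries (name -> content). */
    public static long zippedSize(Map<String, byte[]> entries) {
        CountingSink sink = new CountingSink();
        try (ZipOutputStream zos = new ZipOutputStream(sink)) {
            for (Map.Entry<String, byte[]> e : entries.entrySet()) {
                zos.putNextEntry(new ZipEntry(e.getKey()));
                byte[] data = e.getValue();
                zos.write(data, 0, data.length);
                zos.closeEntry();
            }
        } catch (IOException ex) {
            // The sink itself never fails; this only guards the ZIP API calls.
            throw new UncheckedIOException(ex);
        }
        // The stream is closed at this point, so the central directory is counted too.
        return sink.count();
    }

    public static void main(String[] args) {
        Map<String, byte[]> entries = new LinkedHashMap<>();
        entries.put("zeros.txt", new byte[100_000]);
        System.out.println("Zipped size in bytes: " + zippedSize(entries));
    }
}
```

Reading the count only after the ZipOutputStream has been closed matters, because the central directory is written at that point.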
Upvotes: 2
Reputation: 1871
Just wanted to share how we implemented this manually:
int maxSizeForAllFiles = 70000; // Read from property
int sizePerFile = 22000; // Read from property
long totalFileSize = 0;
boolean toBeZipped = false;
/**
 * Iterate over the attachment list to verify whether a ZIP is required
 */
for (String attachFile : inputAttachmentList) {
    File file = new File(attachFile);
    totalFileSize += file.length();
    /**
     * A ZIP is required as soon as a single file reaches the per-file limit
     */
    if (file.length() >= sizePerFile) {
        toBeZipped = true;
        logger.info("File: "
                + attachFile
                + " Size: "
                + file.length()
                + " File required to be zipped, MAX allowed per file: "
                + sizePerFile);
        break;
    }
}
/**
 * Check if all attachments put together cross MAX_SIZE_FOR_ALL_FILES
 */
if (totalFileSize >= maxSizeForAllFiles) {
    toBeZipped = true;
}
if (toBeZipped) {
    // Zip here, iterating over all attachments
}
Upvotes: 0
Reputation: 41096
+1 for Colin Herbert: add files one by one, and either back up the previous step or remove the last file if the archive is too big. I just want to add some details:
Prediction is way too unreliable. E.g. a PDF can contain uncompressed text and compress down to 30% of the original, or it can contain already-compressed text and images and compress only to 80%. You would need to inspect the entire PDF for compressibility, which basically means compressing it.
You could try a statistical prediction. That could reduce the number of failed attempts, but you would still have to implement the recommendation above. Go with the simpler implementation first, and see if it's enough.
Alternatively, compress the files individually, then pick the files that won't exceed 5 MB if bound together. If unpacking is automated too, you could bind the zip files into a single uncompressed zip file.
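The "compress individually, then pick" idea could look roughly like the sketch below. It measures each file's raw DEFLATE size with java.util.zip.Deflater and then walks the sizes in the given order (e.g. by timestamp). The class name, the method names, and the perEntryOverhead parameter (a caller-chosen allowance for ZIP headers) are assumptions for illustration; stopping at the first file that does not fit is also an assumption, taken from the "skip the remaining files" wording of the question.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.zip.Deflater;

public class IndividualCompression {

    /** Size of the data after raw DEFLATE compression, without keeping the output. */
    public static int deflatedSize(byte[] data) {
        // nowrap = true gives raw DEFLATE, the format used inside ZIP entries.
        Deflater deflater = new Deflater(Deflater.DEFAULT_COMPRESSION, true);
        try {
            deflater.setInput(data);
            deflater.finish();
            byte[] buffer = new byte[8192];
            int total = 0;
            while (!deflater.finished()) {
                total += deflater.deflate(buffer);
            }
            return total;
        } finally {
            deflater.end(); // release native resources
        }
    }

    /**
     * Walks the compressed sizes in order and returns the indices that fit
     * under the limit. Stops at the first entry that would exceed it.
     */
    public static List<Integer> pickWithinLimit(long[] compressedSizes,
                                                long perEntryOverhead,
                                                long limit) {
        List<Integer> picked = new ArrayList<>();
        long total = 0;
        for (int i = 0; i < compressedSizes.length; i++) {
            long next = total + compressedSizes[i] + perEntryOverhead;
            if (next > limit) {
                break;
            }
            total = next;
            picked.add(i);
        }
        return picked;
    }
}
```

For example, with compressed sizes of 40, 40 and 40 bytes, an overhead of 10 bytes per entry and a limit of 100 bytes, the first two files are picked and the third is skipped.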
Upvotes: 3
Reputation: 93157
Wrap your ZipOutputStream in a custom OutputStream, named here YourOutputStream.
The constructor of YourOutputStream creates another ZipOutputStream (zos2) which wraps a new ByteArrayOutputStream (baos):
public YourOutputStream(ZipOutputStream zos, int maxSizeInBytes)
When you want to write a file with YourOutputStream, it will first write it on zos2:
public void writeFile(File file) throws ZipFileFullException
public void writeFile(String path) throws ZipFileFullException
The file is then written to the real stream only if baos.size() is under maxSizeInBytes; otherwise a ZipFileFullException is thrown.
You need two ZipOutputStreams: one to be written to your drive, and one to check whether your contents are over 5 MB.
EDIT : In fact I checked, you can't remove a ZipEntry easily.
http://download.oracle.com/javase/6/docs/api/java/io/ByteArrayOutputStream.html#size()
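A minimal sketch of this two-stream approach follows. The class name SizeLimitedZipWriter and the writeEntry(name, data) method taking in-memory content are assumptions made to keep the example short; ZipFileFullException is named after the answer but defined here as a hypothetical nested class. Each entry goes to the trial stream first, and only reaches the real stream if the trial size stays within the limit.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

public class SizeLimitedZipWriter implements AutoCloseable {

    /** Thrown when the next entry would push the archive over the limit. */
    public static class ZipFileFullException extends IOException {
        public ZipFileFullException(String message) {
            super(message);
        }
    }

    private final ZipOutputStream real;
    private final ByteArrayOutputStream baos = new ByteArrayOutputStream();
    private final ZipOutputStream trial = new ZipOutputStream(baos);
    private final int maxSizeInBytes;

    public SizeLimitedZipWriter(OutputStream target, int maxSizeInBytes) {
        this.real = new ZipOutputStream(target);
        this.maxSizeInBytes = maxSizeInBytes;
    }

    /** Writes the entry to the trial stream first, then to the real one if it fits. */
    public void writeEntry(String name, byte[] data) throws IOException {
        put(trial, name, data);
        // Note: the central directory is only written on close, so baos.size()
        // slightly underestimates the final archive size; leave some headroom.
        if (baos.size() > maxSizeInBytes) {
            // The trial stream now contains the rejected entry and cannot drop it,
            // so the caller should stop adding files after this exception.
            throw new ZipFileFullException("Adding " + name + " would exceed "
                    + maxSizeInBytes + " bytes");
        }
        put(real, name, data);
    }

    private static void put(ZipOutputStream zos, String name, byte[] data) throws IOException {
        zos.putNextEntry(new ZipEntry(name));
        zos.write(data, 0, data.length);
        zos.closeEntry();
    }

    @Override
    public void close() throws IOException {
        trial.close();
        real.close();
    }
}
```

Because an entry cannot be removed once written, the rejected file only ever lands in the in-memory trial stream. The trade-off is that every accepted file is compressed twice and the trial copy of the archive is held in memory, which is acceptable for a 5 MB limit.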
Upvotes: 12
Reputation: 3595
Maybe you could add a file each time, until you reach the 5MB limit, and then discard the last file. Like @Gopi, I don't think there is any way to estimate it without actually compressing the file.
Of course, a file will not get bigger when zipped (or maybe only a little, because of the zip header?), so the uncompressed sizes at least give you a "worst case" estimate.
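The "worst case" idea can be turned into a quick pre-check: if the files fit even when nothing compresses at all, no measurement is needed. The sketch below adds up raw sizes plus the fixed record sizes of the ZIP format (30-byte local header, 16-byte data descriptor, 46-byte central directory header, 22-byte end record). It assumes no extra fields, comments or ZIP64 records, and the slack parameter is a hypothetical, caller-chosen allowance for the small expansion that can occur on incompressible data.

```java
public class WorstCaseZipBound {

    // Fixed record sizes of the ZIP format (without extra fields or comments).
    static final int LOCAL_HEADER = 30;
    static final int DATA_DESCRIPTOR = 16;
    static final int CENTRAL_HEADER = 46;
    static final int END_RECORD = 22;

    /** Upper bound for one entry, assuming compressed size <= rawSize + slack. */
    public static long entryBound(long rawSize, int nameBytes, long slack) {
        return LOCAL_HEADER + nameBytes          // local file header + file name
                + rawSize + slack                // data, assumed not to shrink at all
                + DATA_DESCRIPTOR
                + CENTRAL_HEADER + nameBytes;    // central directory record + file name
    }

    /** Number of leading files that fit under the limit even in the worst case. */
    public static int guaranteedToFit(long[] rawSizes, int[] nameBytes, long slack, long limit) {
        long total = END_RECORD;
        int count = 0;
        for (int i = 0; i < rawSizes.length; i++) {
            total += entryBound(rawSizes[i], nameBytes[i], slack);
            if (total > limit) {
                break;
            }
            count++;
        }
        return count;
    }
}
```

Files beyond the returned count are not necessarily too big; they just need to be actually compressed to find out.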
Upvotes: 0
Reputation: 5407
I did this once on a project with known input types. We knew that, generally speaking, our data compressed around 5:1 (it was all text). So, I'd check the file size and divide by 5...
In this case, the purpose for doing so was to check that files would likely be below a certain size. We only needed a rough estimate.
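As a rough illustration of that kind of estimate, the sketch below divides raw sizes by an assumed compression ratio and counts how many files are likely to fit. The class and method names are hypothetical, and the ratio is something you would have to measure on your own data; it is only meaningful when the inputs are uniform, as in the all-text case described above.

```java
public class RatioEstimate {

    /** Estimated compressed size for a known average ratio, e.g. 5.0 for 5:1. */
    public static long estimateCompressed(long rawBytes, double ratio) {
        return (long) Math.ceil(rawBytes / ratio);
    }

    /** Number of leading files whose estimated total stays within the limit. */
    public static int filesLikelyToFit(long[] rawSizes, double ratio, long limitBytes) {
        long total = 0;
        int count = 0;
        for (long raw : rawSizes) {
            total += estimateCompressed(raw, ratio);
            if (total > limitBytes) {
                break;
            }
            count++;
        }
        return count;
    }
}
```

Since this is only an estimate, the resulting archive still has to be checked against the real limit after it is created.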
All that said, I have noticed zip applications like 7zip will create a zip file of a certain size (like a CD) and then split the zip off to a new file once it reaches the limit. You could look at that source code. I have actually used the command line version of that app in code before. They have a library you can use as well. Not sure how well that will integrate with Java though.
For what it is worth, I've also used a library called SharpZipLib. It was very good. I wonder if there is a Java port to it.
Upvotes: 0
Reputation: 10293
I don't think there is any way to estimate the size of the zip that will be created, because zips are processed as streams. Also, it is not technically possible to predict the size of the compressed output unless you actually compress it.
Upvotes: 0