Reputation: 13190
I have a folder containing files created on Linux, which I currently tar and compress with gzip (i.e. tar.gz).
At a later stage the archive is copied to one filesystem on another Linux machine and extracted into a different filesystem using Java.
My problem is that the archive is 3GB compressed and 5GB uncompressed, and the two filesystems are 4GB and 6GB. I copied the compressed archive to the 4GB filesystem, but when I try to uncompress it to the 6GB filesystem, the archive is first copied there while it is being uncompressed. The 6GB filesystem therefore needs enough space for both the compressed and uncompressed forms, which it does not have.
I'm unclear why it is creating this interim file. If I just do
cd destination folder
tar -zxvf source file
it works without running out of space, but I need to uncompress it using pure Java, not the command line.
Is there a better way to compress the folder? I'm not constrained to any particular format as long as it can be uncompressed with Java code. I cannot modify or reconfigure the size of the two filesystems; it needs to work within these boundaries.
Upvotes: 0
Views: 450
Reputation: 13190
FYI: I just realized that in a tar.gz file the files are tarred and then the tar file is gzipped, so when uncompressing, the interim step of unzipping to a tar is difficult to avoid. However, if I manually gzip each file and then tar the folder as follows:
cd foldertozip
gzip *
cd ..
tar -cvf foldertozip.tar foldertozip
the size of foldertozip.tar is exactly the same as the original foldertozip.tar.gz but the interim step is not required.
Then later on I can:
So the only additional temporary space we use on the 6GB filesystem is what is required to decompress each gz file.
I've tested this out and it worked for me.
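The steps after "Then later on I can:" are not shown above. As a rough sketch of the per-file decompression part only, and assuming the tar has already been unpacked so that the folder contains the individual .gz files, something like the following could be used. The class and method names (GunzipFolder, gunzipAll) are made up for illustration, and only files directly inside the folder are handled.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.GZIPInputStream;

// Hypothetical helper, not part of the original post.
final class GunzipFolder {

    /**
     * Decompresses every regular *.gz file directly inside dir, one at a time,
     * and deletes each .gz as soon as its decompressed copy is written.
     * Returns the number of files decompressed.
     */
    static int gunzipAll(Path dir) throws IOException {
        // Collect the names first so the directory is not modified while it is being listed.
        List<Path> gzFiles = new ArrayList<>();
        try (DirectoryStream<Path> listing = Files.newDirectoryStream(dir, "*.gz")) {
            for (Path p : listing) {
                if (Files.isRegularFile(p)) {
                    gzFiles.add(p);
                }
            }
        }
        byte[] buffer = new byte[8192];
        for (Path gz : gzFiles) {
            String name = gz.getFileName().toString();
            // "data.txt.gz" -> "data.txt"
            Path target = gz.resolveSibling(name.substring(0, name.length() - 3));
            try (InputStream in = new GZIPInputStream(Files.newInputStream(gz));
                 OutputStream out = Files.newOutputStream(target)) {
                int n;
                while ((n = in.read(buffer)) != -1) {
                    out.write(buffer, 0, n);
                }
            }
            // Free the compressed copy before moving on to the next file.
            Files.delete(gz);
        }
        return gzFiles.size();
    }
}
```

Because each .gz is deleted right after it is expanded, the extra space needed at any moment is roughly the size of the file currently being decompressed.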
Upvotes: 1
Reputation: 11440
You got me curious about this one, and yeah, it wasn't too difficult. I used a TCP server and client just to completely separate the input and output streams, to ensure there were no shenanigans.
Essentially, the server reads in the raw ZIP data and sends it to the client. The client then interprets that data as a ZipInputStream and writes all the entries to an output folder. It turns out you don't even need to send the data in big chunks; only the small read/write buffers are ever allocated. I profiled it sending over a 200 MB zip file and the memory consumption barely got off the ground.
You do get a nice SocketException at the end, but that is expected because I added hardly any error handling beyond what is required. The client closes the connection and the server doesn't like that, so it throws an error, but all the data has been transferred by then, so who cares!
I wrote this code for ZIP files because I wasn't paying attention, but I figured I would post it anyway. You can adapt it to use a TAR input stream using one of the libraries available online, but the code should give the general gist.
import java.io.Closeable;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.ServerSocket;
import java.net.Socket;
import java.util.zip.ZipEntry;
import java.util.zip.ZipException;
import java.util.zip.ZipInputStream;

// The class name is arbitrary; it is only here so the code compiles on its own.
public class ZipTransfer {

    /**
     * Starts a server that streams the raw zip file, then a client that unpacks it as it arrives.
     */
    public static void main(String[] args) throws Exception {
        Object serverWait = new Object();
        synchronized (serverWait) {
            // start the server while holding the lock so its notify() cannot fire before we wait
            startServer(serverWait);
            // make sure our server is started and accepting clients, otherwise we run the
            // risk of starting the client before the server is started
            serverWait.wait(2000);
        }
        startClient();
    }

    private static void startServer(final Object serverWait) {
        new Thread(new Runnable() {
            @Override
            public void run() {
                ServerSocket serverSocket = null;
                Socket socket = null;
                InputStream is = null;
                try {
                    serverSocket = new ServerSocket(5555);
                    synchronized (serverWait) {
                        serverWait.notify();
                    }
                    socket = serverSocket.accept();
                    System.out.println("Client accepted, sending data");
                    // just send over the raw zip file and let the client sort through how to parse it
                    is = new FileInputStream("f:\\so\\zip_transfer\\ZipFile.zip");
                    OutputStream os = socket.getOutputStream();
                    int numRead = 0;
                    byte[] buffer = new byte[2048];
                    while ((numRead = is.read(buffer)) != -1) {
                        os.write(buffer, 0, numRead);
                    }
                } catch (IOException e) {
                    // a SocketException is expected here once the client has read all entries and disconnected
                    e.printStackTrace();
                } finally {
                    safeClose(socket);
                    safeClose(serverSocket);
                    safeClose(is);
                }
            }
        }).start();
    }

    private static void startClient() {
        new Thread(new Runnable() {
            @Override
            public void run() {
                Socket socket = null;
                ZipInputStream is = null;
                try {
                    socket = new Socket("127.0.0.1", 5555);
                    System.out.println("Client connected, retrieving data");
                    // the data we are receiving is in zip format
                    is = new ZipInputStream(socket.getInputStream());
                    extactZipInputStream(is, new File("f:\\so\\zip_transfer\\OutputDirectory"));
                } catch (IOException e) {
                    e.printStackTrace();
                } finally {
                    safeClose(socket);
                    safeClose(is);
                }
            }
        }).start();
    }

    public static void extactZipInputStream(ZipInputStream is, File outputFolder) throws ZipException, IOException {
        String root = outputFolder.getCanonicalPath() + File.separator;
        ZipEntry entry = null;
        // Just keep going until we don't have any entries left.
        while ((entry = is.getNextEntry()) != null) {
            System.out.println("Entry: " + entry.getName());
            File file = new File(outputFolder, entry.getName());
            // refuse entries such as "../x" that would end up outside the output folder
            if (!(file.getCanonicalPath() + File.separator).startsWith(root)) {
                throw new IOException("Entry is outside the output folder: " + entry.getName());
            }
            if (entry.isDirectory()) {
                // make the whole path a directory
                file.mkdirs();
            } else {
                // the last element isn't a directory, it's our file, so only create the parents
                file.getParentFile().mkdirs();
                // write the file to the system; try-with-resources closes it even if a write fails
                try (FileOutputStream fos = new FileOutputStream(file)) {
                    int numRead = 0;
                    byte[] buffer = new byte[2048];
                    while ((numRead = is.read(buffer)) != -1) {
                        fos.write(buffer, 0, numRead);
                    }
                }
            }
            is.closeEntry();
        }
    }

    private static void safeClose(Closeable closable) {
        try {
            if (closable != null) {
                closable.close();
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}
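For the tar.gz case in the question, the same streaming idea can be sketched with only the JDK: wrap the archive in a GZIPInputStream and walk the tar entries as they are decompressed, so no intermediate .tar is ever written. The reader below is a deliberately minimal, hand-rolled illustration with made-up names (TarGzStreamExtract, extract). It assumes plain ustar-style headers, handles only regular files and directories, and ignores long-name extensions, the prefix field, checksums, permissions and sizes stored in binary form. For real archives a library such as Apache Commons Compress, with its TarArchiveInputStream, is the safer choice.

```java
import java.io.BufferedInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.zip.GZIPInputStream;

// Hypothetical, minimal tar.gz extractor; a sketch, not a full tar implementation.
final class TarGzStreamExtract {

    /** Streams a .tar.gz into destDir entry by entry, without writing an interim .tar file. */
    static void extract(Path archive, Path destDir) throws IOException {
        Path root = destDir.toAbsolutePath().normalize();
        byte[] header = new byte[512];
        byte[] buffer = new byte[8192];
        try (InputStream in = new GZIPInputStream(
                new BufferedInputStream(Files.newInputStream(archive)))) {
            // A tar stream is a series of 512-byte headers, each followed by the entry data
            // padded to a multiple of 512 bytes; a header starting with a zero byte ends it.
            while (readFully(in, header) && header[0] != 0) {
                String name = text(header, 0, 100);
                long size = Long.parseLong(text(header, 124, 12).trim(), 8); // octal digits
                char type = (char) header[156];
                Path target = root.resolve(name).normalize();
                if (!target.startsWith(root)) {
                    throw new IOException("Entry escapes destination: " + name);
                }
                long toSkip = (size + 511) / 512 * 512; // data plus padding
                if (type == '5') {
                    Files.createDirectories(target);
                } else if (type == '0' || type == 0) {
                    Files.createDirectories(target.getParent());
                    try (OutputStream out = Files.newOutputStream(target)) {
                        copy(in, out, size, buffer);
                    }
                    toSkip -= size; // only the padding is left
                }
                // Skip padding, or the whole body of entry types this sketch does not handle.
                copy(in, null, toSkip, buffer);
            }
        }
    }

    /** Fills block completely; returns false if the stream ends first. */
    private static boolean readFully(InputStream in, byte[] block) throws IOException {
        int off = 0;
        while (off < block.length) {
            int n = in.read(block, off, block.length - off);
            if (n == -1) {
                return false;
            }
            off += n;
        }
        return true;
    }

    /** Copies exactly count bytes from in to out, or discards them when out is null. */
    private static void copy(InputStream in, OutputStream out, long count, byte[] buffer)
            throws IOException {
        while (count > 0) {
            int n = in.read(buffer, 0, (int) Math.min(buffer.length, count));
            if (n == -1) {
                throw new EOFException("Truncated archive");
            }
            if (out != null) {
                out.write(buffer, 0, n);
            }
            count -= n;
        }
    }

    /** Reads a NUL-terminated text field from a tar header. */
    private static String text(byte[] block, int offset, int length) {
        int end = offset;
        while (end < offset + length && block[end] != 0) {
            end++;
        }
        return new String(block, offset, end - offset, StandardCharsets.UTF_8);
    }
}
```

With this shape the archive can stay on the 4GB filesystem while the extracted files are written to the 6GB one, for example TarGzStreamExtract.extract(archivePath, destinationPath), where both paths are placeholders. Memory use stays at the two fixed buffers plus whatever the gzip inflater needs.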
Upvotes: 0