Reputation: 394
I am writing a small program to retrieve a large number of XML files. The program mostly works, but no matter which solution from Stack Overflow I use, every XML file I save locally is missing the end of the file, approximately the last 5-10 lines of XML. The files vary in length (~500-2500 lines), and the total length doesn't seem to affect how much is missing. Currently the code looks like this:
package plos;

import java.io.File;
import java.io.IOException;
import java.net.URL;
import java.util.logging.Level;
import java.util.logging.Logger;
import org.apache.commons.io.FileUtils;

public class PlosXMLfetcher {
    public PlosXMLfetcher(URL u, File f) {
        try {
            // Download the resource at u and write it to the local file f
            FileUtils.copyURLToFile(u, f);
        } catch (IOException ex) {
            Logger.getLogger(PlosXMLfetcher.class.getName()).log(Level.SEVERE, null, ex);
        }
    }
}
I have tried using BufferedInputStream and ReadableByteChannel as well. I have tried running it in threads, and I have tried using read and readLine. Every solution gives me an incomplete XML file.
In some of my tests (I can't remember which, sorry), I got a socket connection reset error - but the above code executes without error messages.
I have manually downloaded some of the XML files as well, to check if they are actually complete on the remote server - which they are.
Upvotes: 1
Views: 736
Reputation: 28016
I'm guessing that somewhere along the way a BufferedWriter or BufferedOutputStream has not had flush() called on it.
Why not write your own copy function to rule out FileUtils.copyURLToFile(u, f)?
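// Requires java.io.File, java.io.FileOutputStream, java.io.IOException,
// java.io.InputStream and java.net.URL
public void copyURLToFile(URL u, File f) throws IOException {
    InputStream in = u.openStream();
    try {
        FileOutputStream out = new FileOutputStream(f);
        try {
            byte[] buffer = new byte[1024];
            int count;
            // read() returns -1 at end of stream, which ends the loop
            while ((count = in.read(buffer)) > 0) {
                out.write(buffer, 0, count);
            }
            out.flush();
        } finally {
            out.close();
        }
    } finally {
        in.close();
    }
}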
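The same copy can be written more compactly. This is only a sketch, assuming Java 7 or later: try-with-resources closes the stream even if the copy fails, and Files.copy handles the read loop. The class name UrlCopy, the method name copy, and the timeout values are made up for illustration and are not from the question.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.net.URLConnection;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper class, named here only for illustration.
public class UrlCopy {

    // Copies everything readable from source into target and
    // returns the number of bytes written.
    public static long copy(URL source, Path target) throws IOException {
        URLConnection conn = source.openConnection();
        // Illustrative timeouts (milliseconds), so a stalled connection
        // raises an exception instead of hanging.
        conn.setConnectTimeout(10_000);
        conn.setReadTimeout(30_000);
        // try-with-resources closes the input stream on every exit path
        try (InputStream in = conn.getInputStream()) {
            return Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
        }
    }
}
```

The returned byte count can be compared with the size of a manually downloaded copy of the same file to see whether the download was cut short.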
Upvotes: 1