Reputation: 87
I am experimenting with Java and created a small program that copies a file and generates an MD5 checksum. The program runs and prints a checksum that matches the source file, but the checksum of the copied file does not match it.
I am new to Java and do not understand what the problem is here. Am I writing the wrong buffer to the output file?
package com.application;

import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.InputStream;
import java.io.OutputStream;
import java.math.BigInteger;
import java.security.MessageDigest;

public class Main {
    static int secure_copy(String src, String dest) throws Exception {
        InputStream inFile = new FileInputStream(src);
        OutputStream outFile = new FileOutputStream(dest);
        MessageDigest md = MessageDigest.getInstance("MD5");
        byte[] buf = new byte[1024];
        int numRead;
        do {
            numRead = inFile.read(buf);
            if (numRead > 0) {
                md.update(buf, 0, numRead);
                outFile.write(buf);
                outFile.flush();
            }
        } while (numRead != -1);
        inFile.close();
        outFile.close();
        BigInteger no = new BigInteger(1, md.digest());
        String result = no.toString(16);
        while (result.length() < 32) {
            result = "0" + result;
        }
        System.out.println("MD5: " + result);
        return 0;
    }

    public static void main(String[] args) {
        try {
            secure_copy(args[0], args[1]);
        } catch (Exception e) {
            System.out.println("Error: " + e.getMessage());
        }
    }
}
Output of the program for the source file (correct):
MD5: 503ea121d2bc6f1a2ede8eb47f0d13ef
The file produced by the copy function, checked via md5sum:
md5sum file.mov
56883109c28590c33fb31cc862619977 file.mov
Upvotes: 0
Views: 198
Reputation: 153
On every read from the InputStream, the code changes the data that the hash is calculated over. Instead of calling md.update(buf, 0, numRead); within the loop, it should read the entire file into a byte[] and then call md.update(entireFileByteArray) once. (See this answer for a way to find the appropriate array size ahead of opening the file.)
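For reference, the read-everything-then-hash approach suggested above might look like the sketch below. The class and method names (WholeFileMd5, md5Hex, md5OfFile) are made up for illustration, and the sketch assumes the file is small enough to fit in memory. Note that MessageDigest accumulates state across update calls, so feeding the same bytes in chunks yields the same digest as a single call; the one-shot form mainly trades memory for simplicity.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.security.MessageDigest;

// Hypothetical helper, not part of the original program.
final class WholeFileMd5 {
    // Hashes a byte array with a single update call and returns 32 lowercase hex digits.
    static String md5Hex(byte[] data) throws Exception {
        MessageDigest md = MessageDigest.getInstance("MD5");
        md.update(data);
        StringBuilder hex = new StringBuilder(32);
        for (byte b : md.digest()) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }

    // Assumption: the whole file fits in memory.
    static String md5OfFile(Path file) throws Exception {
        return md5Hex(Files.readAllBytes(file));
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 1) {
            System.err.println("usage: java WholeFileMd5 <file>");
            return;
        }
        System.out.println("MD5: " + md5OfFile(Paths.get(args[0])));
    }
}
```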
Upvotes: 0
Reputation: 111219
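You are writing the entire buffer to the output file on every iteration, not just the portion that holds data from the latest read. When read returns fewer than 1024 bytes (typically on the last chunk), the rest of the buffer still contains stale bytes from an earlier read, and those get appended to the copy. The checksum is computed over the correct range, which is why it matches the source file but not the copy. The fix is simple: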
    if (numRead > 0) {
        md.update(buf, 0, numRead);
        outFile.write(buf, 0, numRead);
    }
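Put together, a complete version of the copy routine with this fix could look like the sketch below. It is one possible arrangement, not the asker's exact program: the class name SecureCopy and method copyWithMd5 are made up here, and it uses try-with-resources and java.nio.file.Files so both streams are closed even if an exception is thrown.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical class name; a sketch of the corrected copy loop.
final class SecureCopy {
    // Copies src to dest and returns the MD5 of the copied bytes as 32 lowercase hex digits.
    static String copyWithMd5(Path src, Path dest) throws IOException, NoSuchAlgorithmException {
        MessageDigest md = MessageDigest.getInstance("MD5");
        byte[] buf = new byte[1024];
        // try-with-resources closes both streams on every exit path.
        try (InputStream in = Files.newInputStream(src);
             OutputStream out = Files.newOutputStream(dest)) {
            int numRead;
            while ((numRead = in.read(buf)) != -1) {
                // Hash and write only the bytes filled by this read.
                md.update(buf, 0, numRead);
                out.write(buf, 0, numRead);
            }
        }
        StringBuilder hex = new StringBuilder(32);
        for (byte b : md.digest()) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 2) {
            System.err.println("usage: java SecureCopy <src> <dest>");
            return;
        }
        System.out.println("MD5: " + copyWithMd5(Paths.get(args[0]), Paths.get(args[1])));
    }
}
```

With this loop the copy has exactly the same length as the source, so running md5sum on the destination should report the same value the program prints.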
Upvotes: 2