Reputation: 18712
What is the easiest way to check (in a unit test) whether binary files A and B are equal?
Upvotes: 19
Views: 14715
Reputation: 660
Apache Commons IO solution:
boolean sameFileContent = FileUtils.contentEquals(file1, file2);
From documentation:
Compares the contents of two files to determine if they are equal or not. Checks that
Reference: https://commons.apache.org/proper/commons-io/javadocs/api-2.4/org/apache/commons/io/FileUtils.html
Upvotes: 0
Reputation: 3048
Read the files in (small) blocks and compare them:
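// Needs java.io.* and java.util.Arrays; readNBytes and the ranged Arrays.equals require Java 9+.
// Returns true when both files have identical content.
static boolean binaryDiff(File a, File b) throws IOException {
    // Files of different length can never be equal.
    if (a.length() != b.length()) {
        return false;
    }
    final int BLOCK_SIZE = 128;
    // try-with-resources closes both streams, including on an early return.
    try (InputStream aStream = new FileInputStream(a);
         InputStream bStream = new FileInputStream(b)) {
        byte[] aBuffer = new byte[BLOCK_SIZE];
        byte[] bBuffer = new byte[BLOCK_SIZE];
        int aByteCount;
        // readNBytes fills the whole block unless the end of the stream is reached.
        while ((aByteCount = aStream.readNBytes(aBuffer, 0, BLOCK_SIZE)) > 0) {
            int bByteCount = bStream.readNBytes(bBuffer, 0, BLOCK_SIZE);
            // Compare only the bytes actually read, not stale buffer contents.
            if (!Arrays.equals(aBuffer, 0, aByteCount, bBuffer, 0, bByteCount)) {
                return false;
            }
        }
        return true;
    }
}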
Upvotes: 4
Reputation: 31648
There's always just reading byte by byte from each file and comparing as you go. MD5, SHA-1, etc. still have to read all the bytes, so computing a hash is extra work that you don't have to do.
if (file1.length() != file2.length()) {
return false;
}
try( InputStream in1 = new BufferedInputStream(new FileInputStream(file1));
InputStream in2 = new BufferedInputStream(new FileInputStream(file2));
) {
int value1, value2;
do {
//since we're buffered, read() isn't expensive
value1 = in1.read();
value2 = in2.read();
if(value1 != value2) {
return false;
}
} while(value1 >= 0);
// since we already checked that the file sizes are equal
// if we're here we reached the end of both files without a mismatch
return true;
}
Upvotes: 7
Reputation: 2373
Since Java 12 you can also use the Files.mismatch method (see its JavaDoc). It returns -1L if the files are the same.
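A minimal sketch of how that could look, assuming Java 12 or later. The class name FileMismatchExample and the helper sameContent are hypothetical names chosen for illustration; only Files.mismatch itself comes from the JDK.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class FileMismatchExample {

    // Hypothetical helper: true when both files hold exactly the same bytes.
    static boolean sameContent(Path expected, Path actual) throws IOException {
        // Files.mismatch returns -1L when there is no mismatch,
        // otherwise the position of the first differing byte.
        return Files.mismatch(expected, actual) == -1L;
    }

    public static void main(String[] args) throws IOException {
        Path a = Files.createTempFile("a", ".bin");
        Path b = Files.createTempFile("b", ".bin");
        try {
            Files.write(a, new byte[] {1, 2, 3, 4});
            Files.write(b, new byte[] {1, 2, 9, 4});

            System.out.println(sameContent(a, a));    // prints true
            System.out.println(Files.mismatch(a, b)); // prints 2
        } finally {
            Files.deleteIfExists(a);
            Files.deleteIfExists(b);
        }
    }
}
```

In a unit test, asserting on the returned position rather than on a boolean has the side benefit that a failure message shows where the files first diverge.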
Upvotes: 3
Reputation: 198014
Are third-party libraries fair game? Guava has Files.equal(File, File). There's no real reason to bother with hashing if you don't have to; it can only be less efficient.
Upvotes: 15
Reputation: 2373
If you want to avoid dependencies, you can do it quite nicely with Files.readAllBytes and Assert.assertArrayEquals:
Assert.assertArrayEquals("Binary files differ",
Files.readAllBytes(Paths.get(expectedBinaryFile)),
Files.readAllBytes(Paths.get(actualBinaryFile)));
Note: This reads both files entirely into memory, so it might not be efficient with large files.
Upvotes: 4
Reputation: 663
With assertBinaryEquals from the JUnit-addons FileAssert class:
public static void assertBinaryEquals(java.io.File expected,
java.io.File actual)
http://junit-addons.sourceforge.net/junitx/framework/FileAssert.html
Upvotes: 4
Reputation: 17595
I had to do the same in a unit test too, so I used SHA-1 hashes. To avoid computing the hashes unnecessarily, I first check whether the file sizes are equal. Here was my attempt:
public class SHA1Compare {
private static final int CHUNK_SIZE = 4096;
public void assertEqualsSHA1(String expectedPath, String actualPath) throws IOException, NoSuchAlgorithmException {
File expectedFile = new File(expectedPath);
File actualFile = new File(actualPath);
Assert.assertEquals(expectedFile.length(), actualFile.length());
try (FileInputStream fisExpected = new FileInputStream(expectedFile);
FileInputStream fisActual = new FileInputStream(actualFile)) {
Assert.assertEquals(makeMessageDigest(fisExpected),
makeMessageDigest(fisActual));
}
}
public String makeMessageDigest(InputStream is) throws NoSuchAlgorithmException, IOException {
byte[] data = new byte[CHUNK_SIZE];
MessageDigest md = MessageDigest.getInstance("SHA-1");
int bytesRead = 0;
while(-1 != (bytesRead = is.read(data, 0, CHUNK_SIZE))) {
md.update(data, 0, bytesRead);
}
return toHexString(md.digest());
}
private String toHexString(byte[] digest) {
StringBuilder sha1HexString = new StringBuilder();
for(int i = 0; i < digest.length; i++) {
sha1HexString.append(String.format("%1$02x", Byte.valueOf(digest[i])));
}
return sha1HexString.toString();
}
}
Upvotes: 1