codewarrior

Reputation: 1034

MD5 of file downloaded from database, from a JSONObject

My requirement is to compare the MD5 hash of a file on the local disk with the MD5 hash of the same file downloaded from a database. The file is stored in SQL Server in a VARBINARY(MAX) column. It can be of any type; I'm currently testing with a PDF. I fetch the file from the database with an HttpPost request and build a JSONObject from the HttpResponse. The JSONObject contains the file contents in binary form.

Now I have to compare the MD5 hash of the received binary data against the MD5 hash of the same file on disk. I have written the following code but the MD5 hashes do not match. I think I'm going wrong in simply calculating the MD5 of the downloaded binary contents. Is there a correct way to do this? Thanks in advance.

// Read response from a HttpResponse object 'response'
BufferedReader reader = new BufferedReader(new InputStreamReader(response.getEntity().getContent()));
String line="";
StringBuilder sb = new StringBuilder();
while((line=reader.readLine())!=null) {
      sb.append(line);
}

// construct jsonobject
JSONObject jsonResponse = new JSONObject(sb.toString());

//Read file from disk
FileInputStream fis = new FileInputStream(new File(this.getClass().getResource("C:\\demo.pdf").getPath()));

// Calculate MD5 of file read from disk
String md5Request = org.apache.commons.codec.digest.DigestUtils.md5Hex(fis);

// Calculate MD5 of binary contents. "binfile" is name of key in the JSONObject 
// and binary contents of downloaded file are in its corresponding value field
String md5Response = org.apache.commons.codec.digest.DigestUtils.md5Hex(jsonResponse.getString("binfile"));

Assert.assertEquals("Hash sums of request and response must match", md5Request, md5Response);

When I debug, I see this value against the binfile key in the JSONObject 'jsonResponse'

binfile=[37,80,68,70,45,49,46,52,13,37,-30,-29,-49,-45,13,10,52,48...]

and what follows is a lengthy stream of binary data.

Upvotes: 0

Views: 1257

Answers (2)

Xop777

Reputation: 21

This is not a new post, but here is a possible solution: I ran into the same problem in Python and ran a number of tests to work out what was going on.

Since all the data is handled as binary, the file you compare against must also be opened in binary mode.

My original code, which never produced the correct MD5 checksum:

    with open(filepath, "r") as file_to_check:
        tile_file = file_to_check.read()

Corrected code:

    with open(filepath, "rb") as file_to_check:
        tile_file = file_to_check.read()

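Simply adding b (binary) after the r (read) flag tells Python to read the file as binary, and now it works. In text mode Python decodes the bytes and may translate line endings, so what gets hashed is no longer the file's exact contents.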

This may help you track down your problem. Hope it helps!
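
The same idea carries over to Java, which the question uses: hash the file's bytes, not text decoded from them. Below is a minimal sketch of that, not a definitive implementation. The class name FileMd5Sketch, its helper methods, and the sample bytes are all made up for illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class FileMd5Sketch {

    // MD5 of a byte array as 32 lowercase hex digits.
    public static String md5Hex(byte[] data) {
        try {
            byte[] digest = MessageDigest.getInstance("MD5").digest(data);
            StringBuilder hex = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                hex.append(String.format("%02x", b)); // keeps leading zeros
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 not available", e);
        }
    }

    // Binary read: the file's bytes are hashed exactly as stored.
    public static String md5OfFile(Path file) {
        try {
            return md5Hex(Files.readAllBytes(file));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Roughly what a text-mode read does: decode the bytes, then encode again.
    public static byte[] textRoundTrip(byte[] data) {
        return new String(data, StandardCharsets.UTF_8).getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        // Made-up sample: a few PDF-like bytes, including values that are not valid text.
        byte[] sample = {37, 80, 68, 70, 13, (byte) 0xE2, (byte) 0xE3, 13, 10};
        Path tmp = Files.createTempFile("md5-sketch", ".bin");
        Files.write(tmp, sample);
        System.out.println("binary read: " + md5OfFile(tmp));
        System.out.println("text read:   " + md5Hex(textRoundTrip(sample)));
        Files.delete(tmp);
    }
}
```

The two printed hashes differ because decoding replaces the invalid byte sequences, which is the Java counterpart of opening the file with "r" instead of "rb".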

Upvotes: 0

Brian

Reputation: 3713

OK, SQL Server has a built-in function for this, and a query using it looks like this:

select *, 
convert(varchar(50),master.sys.fn_repl_hash_binary(a.BinaryField),2) as 'MD5Hash'
from SomeTable a

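You pass fn_repl_hash_binary the binary field you're reading, and it returns the MD5 hash as a binary value. The "2" is the style argument of CONVERT, not of the hash function: it renders the binary value as a hex string without the 0x prefix, whereas "1" keeps the prefix. The resulting hex string is uppercase, so compare it case-insensitively with a hash computed elsewhere.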

And in Java, you can use something like this:

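import java.math.BigInteger;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

private String getMD5Hash(byte[] bytes) throws NoSuchAlgorithmException {
   MessageDigest m = MessageDigest.getInstance("MD5");
   m.update(bytes, 0, bytes.length);
   // Pad to 32 hex digits; BigInteger.toString(16) alone drops leading zeros
   return String.format("%032x", new BigInteger(1, m.digest()));
}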
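
To feed the downloaded data to a method like getMD5Hash, the binfile value first has to become a byte array. The debug output in the question shows it as a JSON array of signed byte values, so calling getString on it and hashing the result most likely hashes the text "[37,80,...]" and not the file. Here is a rough sketch of the conversion using only the JDK. BinfileMd5Sketch and parseSignedBytes are hypothetical names, and the sketch assumes every element lies in the range -128..127. With org.json, the same loop could read jsonResponse.getJSONArray("binfile") and call getInt(i) on each element.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class BinfileMd5Sketch {

    // Hypothetical helper: turns text such as "[37,80,-30]" into raw bytes.
    // Assumes every element is an integer in the signed byte range -128..127.
    public static byte[] parseSignedBytes(String jsonArrayText) {
        String body = jsonArrayText.trim();
        body = body.substring(1, body.length() - 1).trim(); // drop '[' and ']'
        if (body.isEmpty()) {
            return new byte[0];
        }
        String[] parts = body.split(",");
        byte[] out = new byte[parts.length];
        for (int i = 0; i < parts.length; i++) {
            out[i] = Byte.parseByte(parts[i].trim());
        }
        return out;
    }

    // MD5 of a byte array as 32 lowercase hex digits.
    public static String md5Hex(byte[] data) {
        try {
            byte[] digest = MessageDigest.getInstance("MD5").digest(data);
            StringBuilder hex = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                hex.append(String.format("%02x", b)); // keeps leading zeros
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 not available", e);
        }
    }

    public static void main(String[] args) {
        // First eight values from the question's debug output: "%PDF-1.4"
        String binfile = "[37,80,68,70,45,49,46,52]";
        byte[] raw = parseSignedBytes(binfile);
        System.out.println(new String(raw, StandardCharsets.US_ASCII));
        // Hash of the decoded bytes versus hash of the array's text form
        System.out.println("bytes:  " + md5Hex(raw));
        System.out.println("string: " + md5Hex(binfile.getBytes(StandardCharsets.UTF_8)));
    }
}
```

Only the first hash can match the file on disk, and then only if the whole array is converted, not just the first few values used here.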

This should do the trick. Best of luck, CodeWarrior.

Upvotes: 1
