iffi

Reputation: 53

Why do these two snippets calculate different SHA-1 sums?

Both of the following snippets are supposed to calculate the SHA-1 sum of a file, but for the same file they produce different results.

//snippet1
byte[] byteArr = new byte[(int) uploadedFile.getLength()];
try {
 stream = new BufferedInputStream(uploadedFile.getInputStream());
 stream.read(byteArr);
 stream.close(); 
} catch (IOException e) {
 e.printStackTrace();
}
md = MessageDigest.getInstance("SHA-1"); 
byte[] sha1hash = new byte[40];
md.update(byteArr, 0, byteArr.length);
sha1hash = md.digest();

//snippet2
md = MessageDigest.getInstance("SHA-1");
InputStream is = uploadedFile.getInputStream();
try {
 is = new DigestInputStream(is, md);
} finally {
 try {
  is.close();
 } catch (IOException e) {
  e.printStackTrace();
 }
}
sha1hash = md.digest();

Can you explain why?

Upvotes: 2

Views: 535

Answers (2)

Michael Borgwardt

Reputation: 346536

You have a bug here:

 stream = new BufferedInputStream(uploadedFile.getInputStream());
 stream.read(byteArr);
 stream.close(); 

The read() method does not necessarily fill the array that's passed into it. It reads some number of bytes, up to the length of the array, and returns that number (or -1 at the end of the stream). You have to loop, adding up the returned byte counts, until the array is filled.
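A minimal sketch of the loop described above, in case it helps. The class name ReadFullyExample and the helpers readFully, sha1 and toHex are made up for illustration, and a ByteArrayInputStream stands in for uploadedFile.getInputStream(), which is specific to the asker's upload API.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

class ReadFullyExample {

    // Hypothetical helper: keeps calling read() until 'length' bytes have
    // arrived, and fails if the stream ends before that.
    static byte[] readFully(InputStream in, int length) throws IOException {
        byte[] buffer = new byte[length];
        int offset = 0;
        while (offset < length) {
            // read() may return fewer bytes than requested, so track the offset.
            int n = in.read(buffer, offset, length - offset);
            if (n < 0) {
                throw new IOException(
                        "Stream ended after " + offset + " of " + length + " bytes");
            }
            offset += n;
        }
        return buffer;
    }

    // Hashes a complete in-memory byte array with SHA-1.
    static byte[] sha1(byte[] data) throws NoSuchAlgorithmException {
        return MessageDigest.getInstance("SHA-1").digest(data);
    }

    // Formats a digest as lowercase hex for display.
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            sb.append(String.format("%02x", b));
        }
        return sb.toString();
    }

    public static void main(String[] args) throws Exception {
        // Stand-in for the uploaded file's content (an assumption for the demo).
        byte[] content = "abc".getBytes(StandardCharsets.US_ASCII);
        try (InputStream in = new ByteArrayInputStream(content)) {
            byte[] data = readFully(in, content.length);
            System.out.println(toHex(sha1(data)));
        }
    }
}
```

On Java 9 and later, InputStream.readNBytes and InputStream.readAllBytes do this looping for you.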

Almost everyone gets this wrong the first time. It is one reason why the input-stream-based method is better; the other is that for large files you definitely don't want to keep the whole file in memory.

Upvotes: 3

Joachim Sauer

Reputation: 308269

Both of your snippets are buggy:

  • The first snippet reads some (effectively random) number of bytes from the file and is in no way guaranteed to read the whole file (read the JavaDoc of read() for details).

  • The second snippet doesn't read anything at all from the InputStream: it wraps the stream in a DigestInputStream and closes it immediately, so it returns the SHA-1 of the empty stream (0 bytes read).
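For the second snippet, the digest is only updated as bytes are actually read through the DigestInputStream, so the stream has to be drained before calling digest(). A rough sketch of that, assuming a made-up class StreamingSha1Example with hypothetical helpers sha1 and toHex, and a ByteArrayInputStream in place of uploadedFile.getInputStream():

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.security.DigestInputStream;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

class StreamingSha1Example {

    // Hypothetical helper: reads the stream to its end through a
    // DigestInputStream, then returns the SHA-1 of everything that was read.
    static byte[] sha1(InputStream source) throws IOException, NoSuchAlgorithmException {
        MessageDigest md = MessageDigest.getInstance("SHA-1");
        try (DigestInputStream in = new DigestInputStream(source, md)) {
            byte[] buffer = new byte[8192];
            // Each read() feeds the bytes it returns into md; the data itself
            // is discarded, so only one buffer's worth is in memory at a time.
            while (in.read(buffer) != -1) {
                // nothing else to do here
            }
        }
        return md.digest();
    }

    // Formats a digest as lowercase hex for display.
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            sb.append(String.format("%02x", b));
        }
        return sb.toString();
    }

    public static void main(String[] args) throws Exception {
        // Stand-in for the uploaded file's content (an assumption for the demo).
        byte[] content = "abc".getBytes(StandardCharsets.US_ASCII);
        System.out.println(toHex(sha1(new ByteArrayInputStream(content))));
    }
}
```

If the read loop is left out, as in the question, digest() is called on a MessageDigest that has seen no input, which gives da39a3ee5e6b4b0d3255bfef95601890afd80709, the SHA-1 of zero bytes.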

Upvotes: 12
