k1komans

Reputation: 299

Getting MD5 Hash of File from URL

Files of the same type are returning the same MD5 hash value. For example, two different JPGs give me the same result, whereas a JPG and an APK give different results.

Here is my code...

public static String checkHashURL(String input) {
    try {
        MessageDigest md = MessageDigest.getInstance("MD5");
        InputStream is = new URL(input).openStream();

        try {
            is = new DigestInputStream(is, md);

            int b;

            while ((b = is.read()) > 0) {
                ;
            }
        } finally {
            is.close();
        }
        byte[] digest = md.digest();
        StringBuffer sb = new StringBuffer();

        for (int i = 0; i < digest.length; i++) {
            sb.append(
                    Integer.toString((digest[i] & 0xff) + 0x100, 16).substring(
                            1));
        }
        return sb.toString();

    } catch (Exception ex) {
        throw new RuntimeException(ex);
    }
}

Upvotes: 3

Views: 5779

Answers (1)

Jon Skeet

Reputation: 1501996

This is broken:

while ((b = is.read()) > 0)

Your code will stop at the first byte in the stream whose value is 0. If two files have identical contents up to their first 0 byte, they will produce the same hash, because nothing after that byte is ever fed to the digest. If you really want to call the byte-at-a-time version of read, you want:

while (is.read() != -1) {}

The parameterless InputStream.read() method returns -1 when it reaches the end of the stream.

(There's no need to assign a value to b, as you're not using it.)
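
To make the difference between the two loop conditions concrete, here is a minimal sketch that counts how many bytes each loop consumes from an in-memory stream. The class and method names are hypothetical, and the sample bytes are an assumption: they mimic the start of a typical JFIF-style JPEG header, which contains a 0 byte a few bytes in. That would be consistent with different JPGs hashing to the same value.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical demo class, not part of the original question or answer.
class ReadLoopDemo {
    // Loop condition from the question: stops at the first 0 byte.
    static int countWithBuggyLoop(InputStream is) throws IOException {
        int count = 0;
        while (is.read() > 0) {
            count++;
        }
        return count;
    }

    // Corrected condition: stops only when read() signals end of stream.
    static int countWithFixedLoop(InputStream is) throws IOException {
        int count = 0;
        while (is.read() != -1) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        // Assumed sample data resembling the first bytes of a JFIF JPEG.
        byte[] data = {(byte) 0xFF, (byte) 0xD8, (byte) 0xFF, (byte) 0xE0, 0x00, 0x10, 0x4A};
        System.out.println(countWithBuggyLoop(new ByteArrayInputStream(data))); // prints 4
        System.out.println(countWithFixedLoop(new ByteArrayInputStream(data))); // prints 7
    }
}
```

With the original condition, only the four bytes before the 0 are counted, so any two inputs sharing those four bytes would look identical to the digest.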

Better would be to read a buffer at a time:

byte[] ignoredBuffer = new byte[8 * 1024]; // Up to 8K per read
while (is.read(ignoredBuffer) > 0) {}

This time the condition is valid, because InputStream.read(byte[]) would only ever return 0 if you pass in an empty buffer. Otherwise, it will try to read at least one byte, returning the length of data read or -1 if the end of the stream has been reached.
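
Putting the buffered loop together with the rest of the question's method might look like the sketch below. The class name StreamMd5 and the method md5Hex are hypothetical; the method takes any InputStream, so the question's new URL(input).openStream() could be passed in, while the example itself uses an in-memory stream to stay self-contained.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.security.DigestInputStream;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper class, a sketch rather than a definitive implementation.
class StreamMd5 {
    static String md5Hex(InputStream source) throws IOException, NoSuchAlgorithmException {
        MessageDigest md = MessageDigest.getInstance("MD5");
        // try-with-resources closes the wrapper and the underlying stream.
        try (InputStream in = new DigestInputStream(source, md)) {
            byte[] ignoredBuffer = new byte[8 * 1024]; // Up to 8K per read
            // Reading through DigestInputStream updates md as a side effect.
            while (in.read(ignoredBuffer) > 0) {
                // Nothing to do: the data itself is not needed.
            }
        }
        StringBuilder sb = new StringBuilder();
        for (byte value : md.digest()) {
            // Mask to 0..255 and print as two lowercase hex digits.
            sb.append(String.format("%02x", value & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) throws Exception {
        byte[] data = "abc".getBytes(StandardCharsets.US_ASCII);
        // prints 900150983cd24fb0d6963f7d28e17f72
        System.out.println(md5Hex(new ByteArrayInputStream(data)));
    }
}
```

Because the loop now runs to the end of the stream, two inputs that share a prefix but differ after a 0 byte produce different hashes.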

Upvotes: 5
