Jay Namon

Reputation: 147

How does SHA-1 hashing work in Java on Android?

I searched around for how to hash device identifiers and stumbled on the following code.

I don't really understand what it's doing.

  1. Why do I need to urlEncode the device id?
  2. Why do I need to hash the bytes? Couldn't I just do that on a String?
  3. Why do I need to convert it to a BigInteger?
  4. Why do I need to shift bits to get a String with the hashed id?

Can anyone explain what's going on line by line? I hope this will help other people understand this snippet that's getting passed around in blogs and forums, too.

String hashedId = "";

String deviceId = urlEncode(Secure.getString(context.getContentResolver(), Secure.ANDROID_ID));

try {
    MessageDigest digest = MessageDigest.getInstance("SHA-1");

    byte bytes[] = digest.digest(deviceId.getBytes());

    BigInteger b = new BigInteger(1, bytes);
    hashedId = String.format("%0" + (bytes.length << 1) + "x", b);

} catch (NoSuchAlgorithmException e) {
    //ignored
}

return hashedId;

Upvotes: 0

Views: 1048

Answers (4)

Nayan Rath

Reputation: 303

You can also use this code, which computes the SHA-1 digest of a file:

import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class Sha1Calculate {

    public static void main(String[] args) {
        String path = "D:\\Android Links.txt";

        // try-with-resources closes the stream even if reading fails
        try (InputStream input = new FileInputStream(path)) {
            MessageDigest digest = MessageDigest.getInstance("SHA-1");

            // Feed the file to the digest chunk by chunk
            byte[] buffer = new byte[65536];
            int read;
            while ((read = input.read(buffer)) != -1) {
                digest.update(buffer, 0, read);
            }

            // Format each byte of the digest as two uppercase hex digits
            StringBuilder sb = new StringBuilder();
            for (byte b : digest.digest()) {
                sb.append(String.format("%02X", b));
            }

            System.out.println("Digest(in hex format):: " + sb);
        } catch (IOException e) {
            // Also covers FileNotFoundException
            e.printStackTrace();
        } catch (NoSuchAlgorithmException e) {
            e.printStackTrace();
        }
    }
}

Upvotes: 0

Mark Peters

Reputation: 81154

Why do I need to urlEncode the device id?

Why do I need to hash the bytes? Couldn't I just do that on a String?

Most hashing algorithms, including SHA-1, work on binary data as input (i.e. bytes). Strings themselves don't have a specific binary representation; it changes depending on the encoding.

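The line of code in the question calls getBytes() with no argument, which uses the platform's default charset. That is a bit fragile, because the same String could produce different bytes, and so a different hash, on a platform with a different default. I would prefer to see something like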

byte bytes[] = digest.digest(deviceId.getBytes(Charset.forName("UTF-8")));
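To make the encoding point concrete, here is a minimal sketch showing that one String turns into different bytes under different charsets. The class and method names are made up for illustration; only getBytes and StandardCharsets come from the JDK.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class CharsetBytesDemo {
    // Bytes of the text when encoded as UTF-8, printed as a list.
    static String utf8Bytes(String text) {
        return Arrays.toString(text.getBytes(StandardCharsets.UTF_8));
    }

    // Bytes of the same text when encoded as ISO-8859-1.
    static String latin1Bytes(String text) {
        return Arrays.toString(text.getBytes(StandardCharsets.ISO_8859_1));
    }

    public static void main(String[] args) {
        String text = "\u00e9"; // the single character 'é'
        System.out.println(utf8Bytes(text));   // [-61, -87]
        System.out.println(latin1Bytes(text)); // [-23]
    }
}
```

Different input bytes give a different digest, so naming the charset explicitly keeps the hash stable across platforms.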

Why do I need to convert it to a BigInteger?

This is being used for convenience to help with the conversion to a hexadecimal representation.
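One detail worth a quick sketch is the first argument, 1, in new BigInteger(1, bytes). It is the signum, and it tells BigInteger to read the bytes as a non-negative magnitude. Without it the bytes are read as two's complement, so a digest whose first byte has its high bit set would come out negative. The class and method names below are hypothetical.

```java
import java.math.BigInteger;

class SignumDemo {
    // Hex text of the bytes read as an unsigned (non-negative) number.
    static String unsignedHex(byte[] bytes) {
        return new BigInteger(1, bytes).toString(16);
    }

    // Hex text of the bytes read as a signed two's-complement number.
    static String signedHex(byte[] bytes) {
        return new BigInteger(bytes).toString(16);
    }

    public static void main(String[] args) {
        byte[] bytes = {(byte) 0xff, 0x01}; // high bit of the first byte is set
        System.out.println(unsignedHex(bytes)); // ff01
        System.out.println(signedHex(bytes));   // -ff
    }
}
```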

Why do I need to shift bits to get a String with the hashed id?

The format String being used is %0Nx, which causes the string to be zero-padded to N characters. Since it takes two characters to represent a byte in hexadecimal, N is bytes.length * 2, which is the same value as bytes.length << 1.
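The padding matters because BigInteger drops leading zeros. A small sketch with a made-up three-byte array (not a real digest) shows the difference; the class and method names are hypothetical.

```java
import java.math.BigInteger;

class PaddingDemo {
    // Same idea as the question's snippet: pad to two hex digits per byte.
    static String toHex(byte[] bytes) {
        String format = "%0" + (bytes.length << 1) + "x"; // "%06x" for 3 bytes
        return String.format(format, new BigInteger(1, bytes));
    }

    public static void main(String[] args) {
        byte[] bytes = {0x00, 0x0a, (byte) 0xff};
        // Without padding, the leading zero digits are lost
        System.out.println(new BigInteger(1, bytes).toString(16)); // aff
        // With padding, every byte keeps its two hex digits
        System.out.println(toHex(bytes));                          // 000aff
    }
}
```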

I don't really understand why you wouldn't just include Guava for Android and use the Hashing builder:

String hash = Hashing.sha1().hashString(deviceId, Charsets.UTF_8).toString();

It's one line and doesn't throw checked exceptions.

Upvotes: 3

Graham Borland

Reputation: 60701

You need to hash the bytes, rather than the String, so that you're hashing the character data rather than the String object, which may have unpredictable internal state for a given sequence of characters.

It's converted to BigInteger so it can be consistently formatted with two hex digits per byte. (This is why the length is multiplied by two with the left shift.)

Basically, the answer to all of your questions is: so that you get reliable, repeatable results, even on different platforms.

Upvotes: 1

Adam Liss

Reputation: 48330

About the bit-shifting: shifting left by one is equivalent to multiplying by 2. Each byte of the hash is represented by 2 hex characters, so the resulting string will be twice as long as the number of bytes in the hash.

A SHA-1 digest is 20 bytes, so this will create the format string %040x, which will print an integral value as a zero-padded 40-character string.
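Putting the pieces together for SHA-1, here is a minimal sketch of the question's approach applied to a plain String instead of the Android device id. The class and method names are hypothetical, and the input "abc" is used because its SHA-1 digest is a well-known test vector.

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

class Sha1HexDemo {
    static String sha1Hex(String input) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-1");
            byte[] bytes = digest.digest(input.getBytes(StandardCharsets.UTF_8));
            // bytes.length is 20 for SHA-1, so the format string is "%040x"
            String format = "%0" + (bytes.length << 1) + "x";
            return String.format(format, new BigInteger(1, bytes));
        } catch (NoSuchAlgorithmException e) {
            // Surface the failure instead of silently returning an empty hash
            throw new IllegalStateException("SHA-1 not available", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(sha1Hex("abc")); // a9993e364706816aba3e25717850c26c9cd0d89d
    }
}
```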

Upvotes: 1
