Hashing string into integer in Java applet - how does it work?

Question

I'm working on a challenge from http://www.net-force.nl/challenges/ and I've run into an interesting problem I can't solve. I'm not asking for the full solution (that would break the rules), but I need help with the theory behind the hash function.

Basically, it's based on a Java applet with a single text field where the user has to enter the correct password. When I decompile the .class file, one of the methods I get is this hash method.

The String s contains the entered password, which is passed straight to the method:

private int hash(String s)
{
   int i = 0;
   // add the numeric value of every character to the running total
   for(int j = 0; j < s.length(); j++)
      i += s.charAt(j);

   return i;
}

The problem is that the method returns an integer as the "hash", but how can characters be converted to an integer at all? My first idea was that the password might be a number, but that didn't lead anywhere. Another idea involves ASCII, but I still got nothing.

Thanks for any help or tips.

Jon Skeet · Accepted Answer

The trick is that it's converting each character into an integer. Each character (char) in Java is a UTF-16 code unit. For the most part¹, you can just think of each character as being mapped to a number between 0 and 65535 inclusive, in a scheme called Unicode. For example, 65 is the number for 'A', and if you'd typed in the Euro symbol, that would map to Unicode U+20AC (8364).
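To see the conversion in isolation, here is a minimal sketch. The class name CharCodes and the helper codeOf are made up for illustration and are not part of the applet; the only language feature used is Java's implicit widening of char to int.

```java
// Hypothetical demo class, not taken from the applet.
public class CharCodes {
    // Assigning a char to an int widens it to its UTF-16 code unit value.
    static int codeOf(char c) {
        return c;
    }

    public static void main(String[] args) {
        System.out.println(codeOf('A'));       // prints 65
        System.out.println(codeOf('\u20AC'));  // Euro sign, prints 8364

        // "i += s.charAt(j)" in the question relies on the same widening.
        int sum = 0;
        sum += 'A';
        sum += 'B';
        System.out.println(sum);               // prints 131 (65 + 66)
    }
}
```

No cast is needed in either direction of the addition: the compound assignment adds the char's numeric value to the int.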

Your hashing function basically adds together the numbers for each character in the string. It's a very poor hash (in particular it gives the same results for the same characters regardless of ordering), but hopefully you'll get the idea.
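A short sketch of why ordering doesn't matter: the class name SumHash is hypothetical, and hash simply repeats the logic of the decompiled method from the question with a clearer variable name.

```java
// Hypothetical demo class; hash() mirrors the decompiled method.
public class SumHash {
    static int hash(String s) {
        int total = 0;
        for (int j = 0; j < s.length(); j++) {
            total += s.charAt(j); // char is widened to int before adding
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(hash("abc")); // prints 294 (97 + 98 + 99)
        System.out.println(hash("cba")); // prints 294, same characters reordered
        System.out.println(hash("bbb")); // prints 294, different characters, same sum
    }
}
```

Because addition is commutative, any rearrangement of the same characters collides, and so does any other string whose character values happen to add up to the same total.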

¹ Things get trickier when you need to bear in mind surrogate pairs, where a single Unicode character is actually made up of two UTF-16 code units - that's for characters with a Unicode number of more than 65535. Let's stick to the basics for the moment though :)
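For the curious, a small sketch of what a surrogate pair looks like from Java. The class name SurrogateDemo and the choice of U+1F600 as the sample character are assumptions made for illustration; length, codePointCount, codePointAt and Character.isHighSurrogate are standard library methods.

```java
// Hypothetical demo class; U+1F600 is just one example of a character above U+FFFF.
public class SurrogateDemo {
    // One Unicode character, stored as two UTF-16 code units.
    static final String SAMPLE = "\uD83D\uDE00";

    public static void main(String[] args) {
        System.out.println(SAMPLE.length());                               // prints 2 (code units)
        System.out.println(SAMPLE.codePointCount(0, SAMPLE.length()));     // prints 1 (characters)
        System.out.println(SAMPLE.codePointAt(0));                         // prints 128512 (0x1F600)
        System.out.println(Character.isHighSurrogate(SAMPLE.charAt(0)));   // prints true
    }
}
```

A loop over charAt, like the one in the question, would add the two surrogate values separately rather than the single code point.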
