Reputation: 2392
I'm working on a challenge from http://www.net-force.nl/challenges/ and I've run into an interesting problem I can't solve. I'm not asking for the full solution (that would break the rules), but I need help with the theory behind a hash function.
Basically, it's based on a Java applet with a single text field, where the user has to enter the right password. When I decompile the .class file, one of the methods I get is this hash method.
The parameter String s contains the entered password, which is passed straight to the method:
    private int hash(String s)
    {
        int i = 0;
        // add up the numeric value of every character in the string
        for (int j = 0; j < s.length(); j++)
            i += s.charAt(j);
        return i;
    }
The problem is that the method returns an integer as the "hash", but how can characters be converted to an integer at all? I thought the password might be a number, but that didn't lead anywhere. Another idea was that ASCII codes are involved, but I still got nowhere.
Thanks for any help or tips.
Upvotes: 2
Views: 542
Reputation: 54094
The hash function you present is the simplest hashing function you could possibly write for a string.
It is easy to implement and very fast to compute.
It is problematic, though, since it doesn't distribute the input well.
Assuming ASCII characters, each one contributes a value between 0 and 127, so a string of n characters hashes to a value between 0 and 127 × n; for example, an 8-character string gives at most 1016.
I.e. each character in the string is "treated" as its ASCII code (for a more advanced analysis, check @John's answer).
Anyway, you should note that strings containing the same characters in a different order map to the same hash value with this function. Perhaps this is of interest to you in the challenge you are trying to attack.
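To make the ordering point concrete, here is a small sketch. The hash method repeats the logic from the question (with String capitalised so that it compiles); the class name CollisionDemo and the sample strings are made up for illustration and have nothing to do with the actual challenge.

```java
final class CollisionDemo {
    // Same logic as the method in the question: the sum of all char values.
    static int hash(String s) {
        int i = 0;
        for (int j = 0; j < s.length(); j++) {
            i += s.charAt(j);
        }
        return i;
    }

    public static void main(String[] args) {
        System.out.println(hash("abc")); // 97 + 98 + 99 = 294
        System.out.println(hash("cab")); // same characters reordered: 294
        System.out.println(hash("ad"));  // 97 + 100 = 197
        System.out.println(hash("bc"));  // 98 + 99 = 197, different characters
    }
}
```

As the last two lines suggest, reordering is not the only source of collisions: any two strings whose character values add up to the same total get the same hash.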
Upvotes: 0
Reputation: 1503180
The trick is that it's converting each character into an integer. Each character (char) in Java is a UTF-16 code unit. For the most part¹, you can just think of that as each character is mapped to a number between 0 and 65535 inclusive, in a scheme called Unicode. For example, 65 is the number for 'A', and if you'd typed in the Euro symbol, that would map to Unicode U+20AC (8364).
Your hashing function basically adds together the numbers for each character in the string. It's a very poor hash (in particular it gives the same results for the same characters regardless of ordering), but hopefully you'll get the idea.
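As a rough illustration of that conversion, the sketch below assigns a char to an int (Java widens it implicitly, which is also what happens inside i += s.charAt(j)) and then sums a two-character string. The class name CharToIntDemo is invented for this example, and the hash method just repeats the one from the question.

```java
final class CharToIntDemo {
    // Same logic as the method in the question.
    static int hash(String s) {
        int i = 0;
        for (int j = 0; j < s.length(); j++) {
            i += s.charAt(j); // the char is widened to int before the addition
        }
        return i;
    }

    public static void main(String[] args) {
        int a = 'A';         // implicit char-to-int widening, no cast needed
        int euro = '\u20AC'; // the Euro sign written as a Unicode escape
        System.out.println(a);               // 65
        System.out.println(euro);            // 8364
        System.out.println(hash("A\u20AC")); // 65 + 8364 = 8429
    }
}
```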
1 Things get trickier when you need to bear in mind surrogate pairs, where a single Unicode character is actually made up of two UTF-16 code units - that's for characters with a Unicode number of more than 65535. Let's stick to the basics for the moment though :)
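If you're curious what a surrogate pair looks like in practice, here is a minimal sketch. The class name SurrogateDemo, the helper sample() and the choice of U+1F600 as the example character are all just for illustration.

```java
final class SurrogateDemo {
    // Builds a string holding the single Unicode character U+1F600,
    // which lies above 65535 and therefore needs two UTF-16 code units.
    static String sample() {
        return new String(Character.toChars(0x1F600));
    }

    public static void main(String[] args) {
        String s = sample();
        System.out.println(s.length());                      // 2 code units
        System.out.println(s.codePointCount(0, s.length())); // 1 character
        System.out.println((int) s.charAt(0));               // 55357 (0xD83D)
        System.out.println((int) s.charAt(1));               // 56832 (0xDE00)
        System.out.println(s.codePointAt(0));                // 128512 (0x1F600)
    }
}
```

A loop over charAt, like the one in the question, would see the two surrogate values separately and add 55357 + 56832 rather than 128512.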
Upvotes: 4