Ildar Allayarov
Ildar Allayarov

Reputation: 23

Hashing of strings in Java

My question is concerning the code, which is producing the hash values for strings, summing 4 bytes at a time. It's completely working, but I can't understand some of lines of this code, namely the idea which is performed in some lines. Therefore I need the help of some of yours who is rather familiar with hashing.

Well this is the full code:

long sfold(String s, int M) {
 int intLength = s.length() / 4;
 long sum = 0;
 for (int j = 0; j < intLength; j++) {
   char c[] = s.substring(j * 4, (j * 4) + 4).toCharArray();
   long mult = 1;
   for (int k = 0; k < c.length; k++) {
 sum += c[k] * mult;
 mult *= 256;
   }
 }

 char c[] = s.substring(intLength * 4).toCharArray();
 long mult = 1;
 for (int k = 0; k < c.length; k++) {
   sum += c[k] * mult;
   mult *= 256;
 }

 return(Math.abs(sum) % M);

}

Here each char value is converted to long integer type, summing the result on each iteration of the for-loop. These 2 doubtful lines of code which I mentioned above are as follows:

sum += c[k] * mult;
mult *= 256;

Well, I can understand the whole code, except these 2 lines...

1) Why we need variable 'mult'? Is it probably a usage of multiplication method for hashing?

2) Why we multiply 'mult' exactly by 256 on each iteration? What is 256 in this case?

If some of you have faced with this code or you know the idea, which is performed in these lines, please help me to understand it too :)

Upvotes: 2

Views: 316

Answers (3)

darijan
darijan

Reputation: 9775

Multiplying by 256 is actually shifting bits to the left by 8 positions (1 byte).

So, what it does is:

  • it keeps the bits of the first character in lowest 8 bits (first byte),
  • next character's 8 bits are in the next 8 positions (next byte), and so on, up to four.

I'll give an example for 4-bit system (we would multiply by 16 in that case):

c[0] = 1101
c[1] = 1001 
c[2] = 0010 
c[3] = 0110

it builds long sum whose bits look like:

0110 0010 1001 1101
c[3] c[2] c[1] c[0] 

Upvotes: 0

Ayman
Ayman

Reputation: 11585

The code basically goes one byte at a time. Each byte is 8 bits, or the number 256. In other words, multiplying by 256 is like shifting the value to the left by one byte.

Upvotes: 0

Desert
Desert

Reputation: 2303

Because of the fact that c[k] is char it has size of 8 bits and 8 bits is exactly 256 possible numbers. So for example we have char[] c = new char[]{'a, 'b', 'c', 'd'}, here 'a' in a bit wise form would look something like 10000001 and b something like 10000010 and so on. Now the question is how we form sum, firstly we just take our a representation in bit wise, so sum become 10000001, next we take b in bit wise form and multiply it to 256 which is in fact just bit wise shift on 8 bits to left, that means that 'b' * 256 is the same as 10000001 * 100000000 = 1000000100000000 (256 in bit form is 100000000) and now when we add this 'b' * 256 with previous sum, that means just substituting last 8 bits with a bit form. The same thing happens next.

So in the end we just receive a number which is bit wise concatenation of our previous chars (for example 10000001 10000010 10000011 10000100).

I hope that would help.

Upvotes: 1

Related Questions