hengxin
hengxin

Reputation: 1999

How to generate HashCode in order to call Hashing#consistentHash of google/guava?

In google/guava@GitHub, the class Hashing has implemented consistent hashing@wiki. The method consistentHash requires a HashCode object:

public static int consistentHash(HashCode hashCode, int buckets) {
  return consistentHash(hashCode.padToLong(), buckets);
}

I am implementing a prototype distributed key-value storage, and want to partition the keyspace by custom Row key and Column key.

public int locateNodeIndexFor(Row r, Column c, int buckets) {
  HashCode hashCode = // How to generate a HashCode based on @param r and @param c?
  return Hashing.consistentHash(hashCode, buckets);
}

Here class Row (and class Column) is simply a wrapper of a String field, and has its own hashCode() method. My question is how to generate a HashCode based on @param r and @param c in locateNodeIndexFor in order to call Hashing#consistentHash?

Upvotes: 2

Views: 1513

Answers (2)

Magnus
Magnus

Reputation: 8300

You need to somehow generate a hash of your row and column objects, you can do this by serializing thier data and using one of the hash-functions from the Hashing class, or you could use a faster hash implementation and build a HashCode with one of the factory methods fromInt, fromLong, fromBytes or fromString.

You could simply use your IDE to generate a Java hashCode method for your objects, then use the HashCode.fromInt() factory method to build the google HashCode object.
This will be much faster then serializing a string and using a cryptographic hash.
Whichever option you go with, you need to make sure that the hashcode you build will be the same when the object has the same data in it.
For example hashing the result of the toString method wont work, unless you are overriding the method and providing a textual representation of all your objects data. If you don’t override it you will just get the objects identifier which will always be different between each instance, which would default the purpose of "consistent" hashing.

Upvotes: 0

Louis Wasserman
Louis Wasserman

Reputation: 198211

Use another HashFunction to hash the row and column, e.g.

HashCode h = Hashing.murmur3_32().newHasher()
  .putString(row.getString(), StandardCharsets.UTF_8)
  .putString(col.getString(), StandardCharsets.UTF_8)
  .hash()

Upvotes: 5

Related Questions