faican

Reputation: 45

Map words to single characters

I'm building a hash function that should map any String (at most 100 characters) to a single [A-Z] character. I'm using it for sharding.

I came up with this simple Java function. Is there any way to make it faster?

public static final char stringToChar(final String s) {
    long counter = 0;
    for (char c : s.toCharArray()) {
        counter += c;
    }
    return (char)('A'+(counter%26));
}

Upvotes: 3

Views: 182

Answers (1)

Pado

Reputation: 1637

A quick way to get an even distribution across the "shards" is to use a hash function.

I suggest this method, which uses Java's default String.hashCode() function:

public static char getShardLabel(String string) {
    int hash = string.hashCode();
    // use Math.floorMod instead of the % operator, because % can produce negative results
    int hashMod = Math.floorMod(hash, 26);
    return (char)('A'+(hashMod));
}
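To show why the comment about negative outputs matters, here is a minimal sketch. The class name FloorModDemo and the value -7 are only illustrative stand-ins (any negative hashCode() behaves the same way), and getShardLabel is repeated so the snippet compiles on its own.

```java
final class FloorModDemo {
    // Same logic as getShardLabel above, repeated so this sketch is self-contained.
    static char getShardLabel(String s) {
        return (char) ('A' + Math.floorMod(s.hashCode(), 26));
    }

    public static void main(String[] args) {
        int negativeHash = -7; // stand-in for a negative hashCode() value

        // % keeps the sign of the dividend, so 'A' + result would fall below 'A'.
        System.out.println(negativeHash % 26);               // -7

        // floorMod takes the sign of the divisor, so the result stays in 0..25.
        System.out.println(Math.floorMod(negativeHash, 26)); // 19

        // "".hashCode() is 0, so the empty string maps to the first label.
        System.out.println(getShardLabel(""));               // A
    }
}
```

With the plain % operator you would need an extra step (for example adding 26 when the result is negative) to get the same guarantee.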

As pointed out here, this method is considered "even enough".
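If you want to check the distribution on your own data, something like the following rough sketch may help. The class ShardDistributionCheck, the helper countLabels and the synthetic "key-" + i inputs are assumptions made for illustration; real keys may spread differently, so it is worth feeding in a sample of your actual strings.

```java
final class ShardDistributionCheck {
    // Same logic as getShardLabel above, repeated so this sketch is self-contained.
    static char getShardLabel(String s) {
        return (char) ('A' + Math.floorMod(s.hashCode(), 26));
    }

    // Counts how many of the synthetic keys "key-0" .. "key-(n-1)" land on each label.
    static int[] countLabels(int n) {
        int[] counts = new int[26];
        for (int i = 0; i < n; i++) {
            counts[getShardLabel("key-" + i) - 'A']++;
        }
        return counts;
    }

    public static void main(String[] args) {
        int[] counts = countLabels(260_000);
        // Print one line per label so uneven buckets are easy to spot.
        for (int i = 0; i < counts.length; i++) {
            System.out.println((char) ('A' + i) + ": " + counts[i]);
        }
    }
}
```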

Based on a quick test, it also looks faster than the solution you suggested.
On 80kk (80 million) strings of various lengths:

  • getShardLabel took 65 milliseconds
  • stringToChar took 571 milliseconds
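A comparison like this can be sketched roughly as below. This is not the test that produced the numbers above; the class ShardTiming, the input size and the "key-" + i inputs are made up for illustration. It is a naive System.nanoTime() measurement without JIT warm-up, so treat the output as a rough indication only (a harness such as JMH would be more reliable). Also note that String caches its hash code after the first call, so repeated lookups on the same String instances get cheaper.

```java
final class ShardTiming {
    // The summing approach from the question.
    static char stringToChar(String s) {
        long counter = 0;
        for (char c : s.toCharArray()) {
            counter += c;
        }
        return (char) ('A' + (counter % 26));
    }

    // The hashCode-based approach from this answer.
    static char getShardLabel(String s) {
        return (char) ('A' + Math.floorMod(s.hashCode(), 26));
    }

    public static void main(String[] args) {
        // Synthetic inputs; replace with a sample of real keys for a meaningful result.
        String[] inputs = new String[1_000_000];
        for (int i = 0; i < inputs.length; i++) {
            inputs[i] = "key-" + i;
        }

        int sink = 0; // accumulate results so the JIT cannot discard the calls

        long t0 = System.nanoTime();
        for (String s : inputs) sink += getShardLabel(s);
        long t1 = System.nanoTime();
        for (String s : inputs) sink += stringToChar(s);
        long t2 = System.nanoTime();

        System.out.println("getShardLabel: " + (t1 - t0) / 1_000_000 + " ms");
        System.out.println("stringToChar:  " + (t2 - t1) / 1_000_000 + " ms");
        System.out.println("(ignore) " + sink);
    }
}
```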

Upvotes: 6
