user737862
user737862

Reputation: 61

How to count collisions in a list of strings in Java

How may I count all pairs of collisions in a list of Strings using hashcode of each string?

public class HashCollisions {
private static int strLength;
private static int colls;

public static void main(String[] args) {

    String[] strings ={"AaAaAa","AaAaBB","AaBBAa","AaBBBB"};

    strLength=strings.length;
    for (int i = 0; i < strLength - 1; i++) {
        for (int j = i + 1; j < strLength; j++) {
            if (hash(strings[i]) == hash(strings[j]) && !(strings[i].equals(strings[j]))) {
                    colls++;
            }
        }
    }

    System.out.println(colls);

}

private static byte hash(String s) {
    byte[] bytes = s.getBytes();
    byte result = bytes[0];

    for (int i = 1; i < bytes.length; i++) {
        result ^= bytes[i];
    }

    return result;
}

}

Upvotes: 1

Views: 1707

Answers (2)

Marvin
Marvin

Reputation: 14255

You can group the list of strings by their hashCode and then work with the resulting map. As soon as you have more than one value for a given key there is a collision:

public static void main(String[] args) {
    List<String> strings = Arrays.asList("foo", "bar", "AaAa", "foobar",
            "BBBB", "AaBB", "FB", "Ea", "foo");
    Map<Integer, List<String>> stringsByHash = strings.stream()
            .collect(Collectors.groupingBy(String::hashCode));
    for (Entry<Integer, List<String>> entry : stringsByHash.entrySet()) {
        List<String> value = entry.getValue();
        int collisions = value.size() - 1;
        if (collisions > 0) {
            System.out.println(
                    "Got " + collisions + " collision(s) for strings "
                            + value + " (hash: " + entry.getKey() + ")");
        }
    }
}

This prints:

Got 1 collision(s) for strings [foo, foo] (hash: 101574)
Got 1 collision(s) for strings [FB, Ea] (hash: 2236)
Got 2 collision(s) for strings [AaAa, BBBB, AaBB] (hash: 2031744)

Upvotes: 2

Edward Richman
Edward Richman

Reputation: 3

Why don't you use a Set, put every value in your List into the Set, and find the number of collisions by calculating List.size() - Set.size()?

Upvotes: -1

Related Questions