Sai Suman Chitturi
Sai Suman Chitturi

Reputation: 402

Java HashMap Collision attack

My goal is to attack Java's HashMap with several strings (around 10000 or more), that produce the same hash. I used the script available on this link, translated it to Python3 (so that I can produce the strings on my terminal itself) and ran on my machine to produce around 177147 strings. All these strings produce the same hash when String.hashCode() method is called. From this link, it is evident that the time complexity would be O(N*N) if random read and write is performed on the HashMap. It should take more time if N is huge (in this case, N is greater than 10000). But surprisingly, it is running in less than 2s. I want it to take more than 10s. The following are the scripts I am using.

# Python3 script to generate strings that produce
# same hash with Java's String.hashCode() method
def combs(n,arr,str,arr_rtn):
    if n==1:
        for j in range(3):
            arr_rtn[str + arr[j]] = 0
    else:
        for j in range(3):
            combs(n-1,arr,str+arr[j],arr_rtn)

arr_src = ['at','bU','c6']
str_tmp = ''
arr_rtn = dict()

t = combs(11,arr_src,str_tmp,arr_rtn)

keys = list(arr_rtn.keys())

print(len(keys))
print(*keys, sep = '\n')
// Java code to insert the generated
// strings into HashMap
import java.util.*;

class Main {
    public static void main(String[] args) {
        Scanner sc = new Scanner(System.in);
        int n = sc.nextInt();
        HashMap<String, Integer> map = new HashMap<>();
        for (int i = 0; i < n; i++) {
            String s = sc.next();
            map.put(s, s.hashCode());
            // Should take more time since all strings 
            // hash to same value
        }
        System.out.println(map.size());
        sc.close();
    }
}

My only goal is to attack the HashMap with strings that produce same hash, so that it takes more than 20s (at least) to execute.

Upvotes: 1

Views: 630

Answers (2)

Stephen C
Stephen C

Reputation: 718758

As Tim says, from Java 8 onwards, HashMap will replace the simple linked list hashchain with a balanced binary tree. This happens:

  • when a chain's length exceeds a hard-wired threshold (8 in Java 8), and
  • when the array exceeds another hard-wired threshold (64 in Java 8).

If the key class implements Comparable<?> (as String does) then the compareTo method is used for the tree ordering.

That means that the worst-case for a HashMap<String> with N string keys that are engineered to collide will be O(logN) rather than O(N).

In short, a collision attack against HashMap won't be effective in Java 8 or later. (But you could try running your test attack on Java 7 or older.)


If you need more information on how HashMap's list vs tree stuff works, it is extensively commented in the source code. Find the source code that matches the version you are interested in ... and read it.


Note that the question you cited as your source of information ( java HashMap collision ) was asked and answered in 2013. Java 8 was released in March 2014.

Upvotes: 0

Tim Biegeleisen
Tim Biegeleisen

Reputation: 520968

Actually, Java's HashMap treats collisions by placing multiple values mapping to the same bucket in a linked list. Hence, the worst case performance for search in a Java HashMap is O(N), where N is the number of elements in the map.

Actually, in more recent versions of Java, HashMap deals with collisions by placing all same bucket elements in a tree. This means that the worst case search performance becomes O(h), where h is the height of the tree.

Upvotes: 1

Related Questions