Reputation: 402
My goal is to attack Java's HashMap with several strings (around 10000 or more), that produce the same hash. I used the script available on this link, translated it to Python3 (so that I can produce the strings on my terminal itself) and ran on my machine to produce around 177147 strings. All these strings produce the same hash when String.hashCode() method is called. From this link, it is evident that the time complexity would be O(N*N) if random read and write is performed on the HashMap. It should take more time if N is huge (in this case, N is greater than 10000). But surprisingly, it is running in less than 2s. I want it to take more than 10s. The following are the scripts I am using.
# Python3 script to generate strings that produce
# same hash with Java's String.hashCode() method
def combs(n,arr,str,arr_rtn):
if n==1:
for j in range(3):
arr_rtn[str + arr[j]] = 0
else:
for j in range(3):
combs(n-1,arr,str+arr[j],arr_rtn)
arr_src = ['at','bU','c6']
str_tmp = ''
arr_rtn = dict()
t = combs(11,arr_src,str_tmp,arr_rtn)
keys = list(arr_rtn.keys())
print(len(keys))
print(*keys, sep = '\n')
// Java code to insert the generated
// strings into HashMap
import java.util.*;
class Main {
public static void main(String[] args) {
Scanner sc = new Scanner(System.in);
int n = sc.nextInt();
HashMap<String, Integer> map = new HashMap<>();
for (int i = 0; i < n; i++) {
String s = sc.next();
map.put(s, s.hashCode());
// Should take more time since all strings
// hash to same value
}
System.out.println(map.size());
sc.close();
}
}
My only goal is to attack the HashMap with strings that produce same hash, so that it takes more than 20s (at least) to execute.
Upvotes: 1
Views: 630
Reputation: 718758
As Tim says, from Java 8 onwards, HashMap
will replace the simple linked list hashchain with a balanced binary tree. This happens:
If the key class implements Comparable<?>
(as String
does) then the compareTo
method is used for the tree ordering.
That means that the worst-case for a HashMap<String>
with N
string keys that are engineered to collide will be O(logN)
rather than O(N)
.
In short, a collision attack against HashMap
won't be effective in Java 8 or later. (But you could try running your test attack on Java 7 or older.)
If you need more information on how HashMap
's list vs tree stuff works, it is extensively commented in the source code. Find the source code that matches the version you are interested in ... and read it.
Note that the question you cited as your source of information ( java HashMap collision ) was asked and answered in 2013. Java 8 was released in March 2014.
Upvotes: 0
Reputation: 520968
Actually, Java's HashMap
treats collisions by placing multiple values mapping to the same bucket in a linked list. Hence, the worst case performance for search in a Java HashMap
is O(N)
, where N
is the number of elements in the map.
Actually, in more recent versions of Java, HashMap
deals with collisions by placing all same bucket elements in a tree. This means that the worst case search performance becomes O(h)
, where h
is the height of the tree.
Upvotes: 1