Reputation: 505
I have a MD5 hash (for example "5d41402abc4b2a76b9719d911017c592") and I want to find another string that has the same hash. So far I have created two algorithms (one in Java and another in C#) but they run really slow. At the moment I can only process around 100,000 hashes per second. Are there any other algorithms I should use to speed things up?
This is an example of the algorithm I'm currently using in Java (I have the original hash stored in originalHash, then I generate hashes of other strings that are just numbers and compare the hashes):
import java.security.*;
import java.math.*;
public class b {
public static void main(String args[]) throws Exception{
String s="Hello";
MessageDigest m=MessageDigest.getInstance("MD5");
m.update(s.getBytes(),0,s.length());
String originalHash = new BigInteger(1,m.digest()).toString(16);
System.out.println("MD5: " + originalHash);
for (long i = 0; i < 9223372036854775807L; i++)
{
String iString = i + "";
m.update(iString.getBytes(),0,iString.length());
iString = new BigInteger(1,m.digest()).toString(16);
if (originalHash.equals(iString))
{
System.out.println("Found MD5: " + iString);
break;
}
if (i%1000000 == 0)
{
System.out.println("Count: " + (long)i/1000000 + "M");
System.out.println("Sample Hash: " + iString);
}
}
}
}
Upvotes: 2
Views: 1575
Reputation: 1970
When you are looking at big number crunching and performance (latency) is concern Stack based VM like Java/.Net are not a good option. To get this done in Java implement algorithm in C++ and call it through Java Native Interface. In .Net world use unsafe code to access bytes through pointers. Definitely in both cases you will have to take care of stability/memory management as there would be no Platform/Framework taking care of it for you.
Upvotes: 0
Reputation: 11445
You need to take a peek at GPU programming. You can run thousands of threads to check your hash against your sequentially incrementing number at a time, and the GPU model fits your problem definition well. One example of a hash cracker is oclHashCat.
Otherwise, you could distribute your computation across multiple machines to run the hashes in parallel, like bringing up a hadoop cluster.
Another option is to pre-compute all possible hashes using rainbow tables, and just doing a lookup.
Of course, you could just do a "google" for "md5 hash lookup" and just input your existing MD5 hash and get the string result.
If you are trying to find a random collision between your chosen input and any other value, well... you may be waiting a bit.
Upvotes: 2