Ahmadou Kassoum

Reputation: 89

Generating three distinct strings with equal hashes using the default hash function in C#

I am trying to generate three distinct strings, A, B, and C, such that their hash values are all equal using the default hash function provided by the programming language. Specifically, I need to ensure that A is not equal to B, B is not equal to C, and A is not equal to C.

I have tried several approaches but haven't been successful in finding a solution yet. I am seeking assistance to implement a method or algorithm that can fulfill these requirements. It's crucial that the hash values of all three strings are the same.

Here is my implementation. It is still incomplete: two of the strings collide, but the third one has a different hash.

  // Maps each hash code seen so far to the first string that produced it.
  var dictionary = new Dictionary<int, string>();

  int collisionCounter = 0, stringCounter = 0;
  string myString;
  int hash = 0;

  // Strings whose hash code was already present in the dictionary.
  List<string> myList = new List<string>();

  while (true)
  {
    stringCounter++;
    myString = stringCounter.ToString();

    try
    {
      hash = myString.GetHashCode();
      // Throws ArgumentException if this hash code is already a key.
      dictionary.Add(hash, myString);
    }
    catch (Exception)
    {
      if (dictionary.ContainsKey(hash))
      {
        myList.Add(myString);
        collisionCounter++;
        if (collisionCounter == 2)
        {
          break;
        }
      }
      continue;
    }
  }

  var A = myList[0];
  var B = myList[1];
  var C = dictionary[hash];

  Console.WriteLine($"{A.GetHashCode()} {B.GetHashCode()} {C.GetHashCode()}");

And here is the output of this implementation:

374545419 1954295680 1954295680

I would appreciate any guidance or insights on how to achieve this task effectively. Thank you!

Upvotes: 0

Views: 280

Answers (1)

Theodor Zoulias

Reputation: 43384

String hashcodes in .NET are not stable: on .NET Core and .NET 5+ they are randomized per process, so a given string has a different hashcode each time you run the program. Hashcodes are stable only within a single execution of a program. This .NET feature probably undermines what you are trying to do, but let's assume that string hashcodes in .NET were stable, and try to find an answer to your question under this assumption.
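To see the difference between the two kinds of stability, here is a minimal sketch. The class name HashStabilityDemo and the sample string are made up for illustration. The first line it prints is always True; the second line is the raw hashcode, which you should expect to change from run to run on .NET Core and later.

```csharp
using System;

static class HashStabilityDemo
{
    static void Main()
    {
        string first = "collision";
        // Build an equal string at run time so the two are separate objects.
        string second = new string(first.ToCharArray());

        // Within one process, equal strings always produce equal hashcodes.
        Console.WriteLine(first.GetHashCode() == second.GetHashCode()); // True

        // The value itself is not guaranteed to be the same in the next run.
        Console.WriteLine(first.GetHashCode());
    }
}
```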

You might be able to find 3 different strings with the same hashcode mathematically, by knowing the algorithm that produces the hashcode and reverse-engineering it. This is not unrealistic, because hashcodes are not meant to be cryptographically secure, so reverse-engineering them might be feasible. But I can't help you in this direction because I am not a mathematician.

I'll suggest a brute-force probabilistic approach for solving this problem. .NET hashcodes are 32-bit numbers, so by the pigeonhole principle you are guaranteed at least one collision in any set of 2 ^ 32 + 1 (4,294,967,297) distinct strings. You will need a generator of strings that can produce more unique strings than this number. A good candidate seems to be a generator of all 8-character strings of lower-case Latin letters, with a population space of 26 ^ 8 = 208,827,064,576 strings. On average ~48 strings will share the same hashcode (26 ^ 8 / 2 ^ 32 ≈ 48.6), so you would have to be very unlucky to pick a string that doesn't collide with at least 2 others. The algorithm to find the 3 strings goes like this:

  1. Add the first generated string to a list a, and store its hashcode in a variable b.
  2. Start a loop where each iteration generates the next string and compares its hashcode with b. If the values are equal, add the generated string to the list a.
  3. Exit the loop when the list a contains 3 strings. These strings are different, and they share the same hashcode.

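I would expect you to have your result after about 8 billion iterations of the loop. Assuming the hashcodes are roughly uniformly distributed, each new string matches b with probability 1 / 2 ^ 32, and you need 2 matches, so the expected count is 2 × 2 ^ 32 ≈ 8.6 billion.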
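The three steps above can be sketched roughly as follows. This is only an illustration under some assumptions: the names CollisionSearch, ToCandidate and FindCollisions are made up, the strings are generated by counting in base 26, and the hash function is passed in as a parameter so that the demo can use a toy 8-bit hash and finish instantly.

```csharp
using System;
using System.Collections.Generic;

static class CollisionSearch
{
    // Hypothetical generator: maps an index in [0, 26^8) to a unique
    // 8-character string of lower-case Latin letters (base-26 counting).
    public static string ToCandidate(long index)
    {
        var chars = new char[8];
        for (int i = 7; i >= 0; i--)
        {
            chars[i] = (char)('a' + (int)(index % 26));
            index /= 26;
        }
        return new string(chars);
    }

    // Steps 1-3: remember the hashcode of the first string, then scan the
    // following strings until `count` of them share that hashcode.
    // Returns fewer than `count` strings if `maxIndex` is reached first.
    public static List<string> FindCollisions(Func<string, int> hash, int count, long maxIndex)
    {
        var a = new List<string> { ToCandidate(0) }; // the list "a"
        int b = hash(a[0]);                          // the variable "b"
        for (long i = 1; i < maxIndex && a.Count < count; i++)
        {
            string candidate = ToCandidate(i);
            if (hash(candidate) == b)
            {
                a.Add(candidate);
            }
        }
        return a;
    }

    static void Main()
    {
        const long population = 208_827_064_576; // 26 ^ 8

        // Toy hash (assumption for the demo): keep only the low 8 bits.
        List<string> result = FindCollisions(s => s.GetHashCode() & 0xFF, 3, population);
        Console.WriteLine(string.Join(" ", result));
    }
}
```

Passing s => s.GetHashCode() instead of the masked version runs the real search, which should take on the order of the 8 billion iterations mentioned above. Because string hashcodes are not stable across runs, the three strings it finds are only guaranteed to collide within the process that found them.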

Upvotes: 5
