Sanka

Reputation: 1324

Is it possible to use hashCode as a content unique ID?

I need to compare two large strings. Rather than calling equals on them directly, is there something like hashCode that generates a unique ID for a String? My strings are very large, and I need an ID that is distinct for each distinct content. Is it possible to use String's hashCode for this purpose?

Upvotes: 0

Views: 390

Answers (1)

supercat

Reputation: 81115

The purpose of hashCode is to provide a quick means of identifying most of the circumstances where two objects would compare unequal: if two hash codes differ, the objects are certainly unequal, but equal hash codes do not prove that the contents are equal. A hash function with a 1% false-positive rate would for most purposes be considered superior to one with a 0% false-positive rate that takes twice as long.
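To make the false-positive point concrete, here is a small sketch using the well-known pair "Aa" and "BB", which share a String.hashCode value in Java. The class name HashCollision and the helper sameHash are made up for illustration.

```java
class HashCollision {
    // Hypothetical helper: true when both strings report the same 32-bit hash code.
    static boolean sameHash(String a, String b) {
        return a.hashCode() == b.hashCode();
    }

    public static void main(String[] args) {
        String a = "Aa";
        String b = "BB";

        // String.hashCode is s[0]*31 + s[1] for a two-character string:
        // 'A'(65)*31 + 'a'(97) = 2112 and 'B'(66)*31 + 'B'(66) = 2112.
        System.out.println(a.hashCode());   // prints 2112
        System.out.println(b.hashCode());   // prints 2112

        // Same hash code, different content: hashCode alone is not a unique ID.
        System.out.println(sameHash(a, b)); // prints true
        System.out.println(a.equals(b));    // prints false
    }
}
```

So a hashCode mismatch can safely rule out equality, but a match still has to be confirmed with equals.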

There are some hashing functions which are designed for use as "digests", such that two different strings of arbitrary length would be very unlikely to have the same digest. To be effective, however, a digest needs to be much larger than a 32-bit hashCode value. A well-designed 64-byte (512-bit) digest would generally be adequate to guard strings of any length well enough that one would be more likely to get struck by lightning twice on the same weekend that one wins five state lotteries than to find two different strings that yield the same digest. Computing a good digest for a string costs much more than comparing that string to one other string, but if each string will be compared against many other strings, computing each digest once and comparing it to the digests of the other strings may offer a major performance win.
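As a rough sketch of the compute-once, compare-many idea, the following uses java.security.MessageDigest with SHA-512, which produces a 64-byte digest. The choice of SHA-512 and UTF-8 is my assumption, as are the names DigestCompare, digestOf and sameDigest; it also assumes the JDK in use ships a SHA-512 provider, which the standard ones do.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;

class DigestCompare {
    // Hypothetical helper: 64-byte SHA-512 digest of the string's UTF-8 bytes.
    static byte[] digestOf(String s) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-512");
            return md.digest(s.getBytes(StandardCharsets.UTF_8));
        } catch (NoSuchAlgorithmException e) {
            // Assumption: SHA-512 is present; fail loudly if it is not.
            throw new IllegalStateException("SHA-512 not available", e);
        }
    }

    // Hypothetical helper: compares two digests byte by byte.
    static boolean sameDigest(byte[] a, byte[] b) {
        return Arrays.equals(a, b);
    }

    public static void main(String[] args) {
        // Build two large strings with identical content and one that differs
        // only in its final character.
        char[] chars = new char[1_000_000];
        Arrays.fill(chars, 'x');
        String first = new String(chars);
        String second = new String(chars);
        chars[chars.length - 1] = 'y';
        String third = new String(chars);

        // Each digest is computed once; later comparisons touch only 64 bytes.
        byte[] d1 = digestOf(first);
        byte[] d2 = digestOf(second);
        byte[] d3 = digestOf(third);

        System.out.println(d1.length);          // prints 64
        System.out.println(sameDigest(d1, d2)); // prints true
        System.out.println(sameDigest(d1, d3)); // prints false
    }
}
```

In practice the digests would be cached alongside the strings (for example as map keys in hex form), so each large string is read in full only once no matter how many comparisons follow.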

Upvotes: 2
