Reputation: 15092
We use md5 as a hashing algorithm in many parts of our code.
Security in this context is NOT an issue. We just use md5 to generate a unique identifier for storing various items in APC etc.
Collisions are an issue: although unlikely, one would cause major problems.
Does anyone want to suggest something lighter on the CPU?
Thanks.
We have just done some testing of md5 vs crc32,
using the following snippet:
<?php
// Time one million hashes of random 8-digit numbers.
$start = microtime(true);
for ($i = 1; $i <= 1000000; $i++) {
    md5(rand(10000000, 99999999)); // crc32() was tested here too
}
$end = microtime(true);
echo ($end - $start) . "\n";
?>
The results (in seconds) are as follows:
md5:
1.4991459846497
1.7893800735474
1.4672470092773
crc32:
0.97880411148071
0.94331979751587
0.93316197395325
So it would appear crc32 takes roughly a third less time than md5.
Upvotes: 9
Views: 6860
Reputation:
It would be very hard (almost impossible, really) to beat CRC32 or a variant, as it is so trivial (a rolling XOR across a single 32-bit word). Furthermore, since crc32 cheats and jumps to native code, the native CRC32 implementation is unlikely to be beaten unless the other solution does that as well.
However, it also has a much smaller output space than MD5 (32 bits versus 128 bits). Is the trade-off okay? CRC32s are usually only used for basic error detection/framing... (It really is a "checksum" and not a "hashing" function for practical conversation purposes.)
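To put a rough number on that trade-off, here is a minimal sketch using the standard birthday-bound approximation. It assumes the hash values are uniformly distributed, which CRC32 is not designed to guarantee, so treat the CRC32 figures as optimistic. The function name collisionProbability is just an illustrative helper, not anything built into PHP.

```php
<?php
// Birthday-bound approximation: chance of at least one collision among
// $n uniformly distributed keys hashed into a space of 2^$bits values.
// Assumption: hash outputs behave like uniform random values.
function collisionProbability($n, $bits)
{
    $space = pow(2.0, $bits);
    $x = ($n * ($n - 1.0)) / (2.0 * $space);
    // -expm1(-x) equals 1 - exp(-x) but stays accurate when x is tiny.
    return -expm1(-$x);
}

foreach (array(1000, 100000, 1000000) as $n) {
    printf(
        "%8d keys: 32-bit %.6f, 128-bit %.3e\n",
        $n,
        collisionProbability($n, 32),
        collisionProbability($n, 128)
    );
}
```

Under that assumption, a 32-bit space is more likely than not to contain a collision by the time you store around 100,000 keys, while a 128-bit space stays negligible at that scale.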
Happy coding.
Also, your numbers only show a reduction to roughly 2/3 of the time ;-) In any case, I suspect this is not the main bottleneck, and I would highly recommend using an algorithm that will work, be it MD5, SHA1, or another. MD5 is only slightly less computationally expensive than SHA1 (it's within an order of magnitude), but it is possible that the implementation plays a factor. Run benchmarks on this as well if desired...
Upvotes: 8
Reputation: 679
One of the comments in the PHP online manual shows that md4 is the fastest, then md5, followed by crc32, followed by sha1.
My own tests verify this. It is very strange that your test should show otherwise. I tried your snippet as well and got the opposite results. Perhaps it's machine or PHP version dependent.
http://php.net/manual/en/function.hash-algos.php
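If you want to repeat the comparison on your own machine, something along these lines may help. It is only a sketch: it pre-generates the inputs so that random-number generation stays out of the timed loop, and it uses hash() with the algorithm names 'md4', 'md5', 'crc32b' and 'sha1', skipping any that hash_algos() does not report. The input count of 100,000 is an arbitrary choice.

```php
<?php
// Build the inputs once so every algorithm hashes identical data and
// mt_rand() is not part of what gets measured.
$inputs = array();
for ($i = 0; $i < 100000; $i++) {
    $inputs[] = (string) mt_rand(10000000, 99999999);
}

$available = hash_algos();
$timings = array();
foreach (array('md4', 'md5', 'crc32b', 'sha1') as $algo) {
    if (!in_array($algo, $available, true)) {
        continue; // skip algorithms this PHP build does not provide
    }
    $start = microtime(true);
    foreach ($inputs as $input) {
        hash($algo, $input);
    }
    $timings[$algo] = microtime(true) - $start;
}

asort($timings); // fastest first
foreach ($timings as $algo => $seconds) {
    printf("%-7s %.4f s\n", $algo, $seconds);
}
```

The ordering can differ between machines and PHP versions, so run it a few times before drawing conclusions.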
Upvotes: 1
Reputation: 6753
Well, you can just use the variable name as a hash if you want to go light on the CPU. If you want to convert a string to an int, just treat it as a base-256 number and convert it to an int.
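A minimal sketch of that base-256 idea, assuming the keys are short: a native PHP integer only has room for PHP_INT_SIZE - 1 bytes without risking overflow, so longer strings are rejected here. The function name stringToInt is made up for illustration.

```php
<?php
// Interpret a string's bytes as the digits of a base-256 number.
// Only safe for strings shorter than PHP_INT_SIZE bytes; anything
// longer could overflow a native integer.
function stringToInt($s)
{
    $length = strlen($s);
    if ($length >= PHP_INT_SIZE) {
        throw new InvalidArgumentException('string too long for a native int');
    }
    $value = 0;
    for ($i = 0; $i < $length; $i++) {
        $value = $value * 256 + ord($s[$i]);
    }
    return $value;
}

echo stringToInt('ab'), "\n"; // 97 * 256 + 98 = 24930
```

This is collision-free for strings that fit, but it does not shorten anything, so it only helps when the keys are already tiny.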
Maybe you can try sha1, which produces a 40-character hex string for any input.
Upvotes: 0