Sina Karvandi

Reputation: 1102

Use a combination of SHA1+MD5

I'm looking for a secure way to create checksums for files (larger than 10 GB!).

SHA256 is secure enough for me, but it is too computationally expensive to be suitable here. I know that both SHA1 and MD5 checksums are insecure because of collisions.

So I think the fastest and safest approach is to combine MD5 with SHA1 (SHA1+MD5), and I don't think there is a way to produce a file (a collision) with the same MD5 and the same SHA1 at the same time.

So is combining SHA1+MD5 secure enough for a file checksum, or is there a collision-style attack against it?

I use C# on Mono in two ways (with and without a BufferedStream):

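    using System;
    using System.IO;
    using System.Security.Cryptography;

    // Hashes a file by path; both the stream and the hash object are disposed.
    public static string GetChecksum(string file)
    {
        using (FileStream stream = File.OpenRead(file))
        using (SHA256 sha = SHA256.Create())
        {
            byte[] checksum = sha.ComputeHash(stream);
            return BitConverter.ToString(checksum).Replace("-", String.Empty);
        }
    }

    // Same hash, read through a 32 KB BufferedStream.
    // Note: disposing the BufferedStream also closes the caller's stream.
    public static string GetChecksumBuffered(Stream stream)
    {
        using (var bufferedStream = new BufferedStream(stream, 1024 * 32))
        using (SHA256 sha = SHA256.Create())
        {
            byte[] checksum = sha.ComputeHash(bufferedStream);
            return BitConverter.ToString(checksum).Replace("-", String.Empty);
        }
    }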

Update 1: I mean the SHA1 hash plus the MD5 hash: first calculate the SHA1 of the file, then calculate the MD5 of the file, then concatenate the two strings.
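For illustration, here is a minimal sketch of that combination that reads the file only once and feeds each block to both hash objects. The class name `CombinedChecksum`, the method name `GetSha1PlusMd5`, and the 1 MB buffer size are assumptions made for this example, not part of the original code.

```csharp
using System;
using System.IO;
using System.Security.Cryptography;

public static class CombinedChecksum
{
    // Hypothetical helper: one pass over the stream, two hashes updated per block.
    public static string GetSha1PlusMd5(Stream stream)
    {
        using (SHA1 sha1 = SHA1.Create())
        using (MD5 md5 = MD5.Create())
        {
            var buffer = new byte[1024 * 1024]; // 1 MB read buffer (assumed size)
            int read;
            while ((read = stream.Read(buffer, 0, buffer.Length)) > 0)
            {
                // A null output buffer means "just update the hash state".
                sha1.TransformBlock(buffer, 0, read, null, 0);
                md5.TransformBlock(buffer, 0, read, null, 0);
            }

            // Finalize both hashes with an empty last block.
            sha1.TransformFinalBlock(buffer, 0, 0);
            md5.TransformFinalBlock(buffer, 0, 0);

            // SHA1 hex digest followed by MD5 hex digest, as described above.
            return ToHex(sha1.Hash) + ToHex(md5.Hash);
        }
    }

    private static string ToHex(byte[] bytes)
    {
        return BitConverter.ToString(bytes).Replace("-", String.Empty);
    }
}
```

Reading once matters for files this large, because the disk read is done a single time instead of once per algorithm. Usage would look like `CombinedChecksum.GetSha1PlusMd5(File.OpenRead(path))`, with the caller disposing the stream.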

Update 2:

As @zaph suggested, I reimplemented my code (C# on Mono) according to what I read here, but it did not make my code as fast as he said! For a 4.6 GB file it reduced the time from approximately 12 minutes to a little over 8 minutes, whereas SHA1+MD5 takes less than 100 seconds for the same file. So I still think SHA256 is not the right choice here.

Upvotes: 5

Views: 3266

Answers (2)

zaph

Reputation: 112857

There should be only a small difference between SHA-256 and a combination of MD5+SHA1.

The only way to know is to benchmark:

On my desktop:
SHA-256: 200 MB/s
MD5: 470 MB/s
SHA1: 500 MB/s (updated, previously incorrect)
MD5+SHA1: 240 MB/s

These rates are for the hashing only; disk read time is not included. The tests were done with a 1 MB buffer and averaged over 10 runs. The language was C and the library used was Apple's Common Crypto. The CPU was a 2.8 GHz Quad-Core Intel Xeon (2010 Mac Pro; my laptop is faster).
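A comparable measurement can be sketched in C# for anyone who wants to repeat it on their own machine. This is not the C/Common Crypto code used for the numbers above; the class name `HashBenchmark`, the method `MeasureMBPerSecond`, and the 256 MB per run are assumptions chosen for the example. It hashes an in-memory 1 MB buffer repeatedly, so disk time is excluded in the same way.

```csharp
using System;
using System.Diagnostics;
using System.Security.Cryptography;

public static class HashBenchmark
{
    // Hypothetical helper: returns hashing throughput in MB/s over several runs.
    public static double MeasureMBPerSecond(Func<HashAlgorithm> create, int megabytesPerRun, int runs)
    {
        var buffer = new byte[1024 * 1024];   // 1 MB buffer, as in the test described above
        new Random(42).NextBytes(buffer);     // arbitrary fixed content

        var watch = new Stopwatch();
        for (int run = 0; run < runs; run++)
        {
            using (HashAlgorithm hash = create())
            {
                watch.Start();                // time only the hashing calls
                for (int i = 0; i < megabytesPerRun; i++)
                    hash.TransformBlock(buffer, 0, buffer.Length, null, 0);
                hash.TransformFinalBlock(buffer, 0, 0);
                watch.Stop();
            }
        }
        return (double)megabytesPerRun * runs / watch.Elapsed.TotalSeconds;
    }

    public static void Main()
    {
        Console.WriteLine("SHA-256: {0:F0} MB/s", MeasureMBPerSecond(() => SHA256.Create(), 256, 10));
        Console.WriteLine("MD5:     {0:F0} MB/s", MeasureMBPerSecond(() => MD5.Create(), 256, 10));
        Console.WriteLine("SHA1:    {0:F0} MB/s", MeasureMBPerSecond(() => SHA1.Create(), 256, 10));
    }
}
```

The absolute figures will depend on the runtime and on whether its hash implementations are managed or native, which is likely why results on Mono can differ a lot from a native library.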

In the end, the combined MD5+SHA1 is only about 20% faster than SHA-256 on this machine (240 MB/s versus 200 MB/s).

Note: Most Intel processors have instructions that can be used to make crypto operations faster. Not all implementations utilize these instructions.

You might try a native implementation such as sha256sum.

Upvotes: 2

Mingky

Reputation: 41

If by SHA1+MD5 you mean hashing with SHA-1 first and then using that digest as input to MD5, then you are not eliminating collisions completely, just potentially reducing the chance of one occurring.

Both SHA-1 and MD5 are fixed-length cryptographic hash functions, and by the Pigeonhole Principle collisions are bound to exist whenever the message can be longer than the digest. There are two instances of this in your use case:

  • When you hash your arbitrary-length message with SHA-1
  • When the 160-bit SHA-1 digest is used as input to MD5

My point is that collisions will always exist. However, the probability of running into one by accident is exceedingly small. If the sole purpose is file integrity, SHA-1 will do the job just fine on its own.
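To put a rough number on "exceedingly small", the standard birthday-bound approximation for accidental collisions among n random inputs with a b-bit digest is p ≈ n² / 2^(b+1), valid while p is small. The sketch below evaluates it; the class and method names are hypothetical, and the figure covers accidental collisions only, not collisions constructed deliberately by an attacker.

```csharp
using System;

public static class CollisionEstimate
{
    // Birthday-bound approximation: p ~ n^2 / 2^(bits + 1), for small p.
    public static double AccidentalCollisionProbability(double fileCount, int digestBits)
    {
        return fileCount * fileCount / Math.Pow(2.0, digestBits + 1);
    }

    public static void Main()
    {
        // One billion distinct files hashed with a 160-bit digest such as SHA-1.
        double p = AccidentalCollisionProbability(1e9, 160);
        Console.WriteLine(p); // on the order of 1e-31
    }
}
```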

Related:

What checksum algorithm should I use?

Is MD5 still good enough to uniquely identify files?

Upvotes: 2
