Leandro Bardelli
Leandro Bardelli

Reputation: 11578

Using MD5 in SQL Server 2005 to do a checksum file on a varbinary

Im trying to do a MD5 check for a file uploaded to a varbinary field in MSSQL 2005.

I uploaded the file and using

SELECT DATALENGTH(thefile) FROM table

I get the same number of bytes that the file has.

But using MD5 calculator (from bullzip) i get this MD5:

20cb960d7b191d0c8bc390d135f63624

and using SQL I get this MD5:

44c29edb103a2872f519ad0c9a0fdaaa

Why they are different if the field has the same lenght and so i assume the same bytes?

My SQL Code to do that was:

DECLARE @HashThis varbinary;
DECLARE @md5text varchar(250);
SELECT  @HashThis = thefile FROM CFile WHERE id=1;

SET @md5text = SUBSTRING(sys.fn_sqlvarbasetostr(HASHBYTES('MD5',@HashThis)),3,32)
PRINT @md5text;

Maybe the data type conversion?

Any tip will be helpful.

Thanks a lot :)

Upvotes: 3

Views: 6836

Answers (2)

Oleg Dok
Oleg Dok

Reputation: 21766

Two options

  1. VARBINARY type without size modifier utilizes VARBINARY(1), so you are hashing the very 1st byte of file, SELECT DATALENGTH(@HashThis) after assignment will bring to you 1
  2. If you use varbinary(MAX) instead - then keep in mind, that HASHBYTES hashes only first 8000 bytes of input

If you want to perform hashing more than 8000 bytes - write your own CLR hash function, for example the file is from my sql server project, it brings the same results as other hash functions outside of sql server:

using System;
using System.Data.SqlTypes;
using System.IO;

namespace ClrHelpers
{
    public partial class UserDefinedFunctions {
        [Microsoft.SqlServer.Server.SqlFunction]
        public static Guid HashMD5(SqlBytes data) {
            System.Security.Cryptography.MD5CryptoServiceProvider md5 = new System.Security.Cryptography.MD5CryptoServiceProvider();
            md5.Initialize();
            int len = 0;
            byte[] b = new byte[8192];
            Stream s = data.Stream;
            do {
                len = s.Read(b, 0, 8192);
                md5.TransformBlock(b, 0, len, b, 0);
            } while(len > 0);
            md5.TransformFinalBlock(b, 0, 0);
            Guid g = new Guid(md5.Hash);
            return g;
        }
    };
}

Upvotes: 4

aF.
aF.

Reputation: 66697

It can be that MD5 Calculator is making the MD5 Hash of file content + other properties (ex: author, last process date, etc.). You may try to do alter these properties and make another hash to see if the result is equal (between before and after using only MD5 Calculator).

Another possibility is about what are you really saving in SQL Server..


So, it's quite clear, MD5 Calculator and SQL Server are hashing different things. What? I give a congratz to who answers it :)

Upvotes: 1

Related Questions