Underoos
Underoos

Reputation: 5190

What is the Postgres MD5() equivalent in SQL Servrer?

I'm migrating data from SQL Server 2017 to Postgres 10.5.

In order to compare data consistency, I want to perform hashing on the rows in a table.

This is my approach for that.

I'll issue a query on SQL server table and Postgres table and get list of hashes for all the rows. I'll be having two hash lists.

  1. SQL Server table hash list
  2. Postgres table hash list

I can compare them both and check if the data is consistent.

I found a function to perform hashing on rows in a table in Postgres.

select md5(f::text) from table_name as f

It is returning hashes for all the rows in the table like below.

hash1_for_row1
hash2_for_row2
hash3_for_row3
hash4_for_row4
....

But I couldn't find an equivalent function or something in SQL Server to do the same MD5 hashing.

I've looked at HASHBYTES() in SQL Server 2017 https://learn.microsoft.com/en-us/sql/t-sql/functions/hashbytes-transact-sql?view=sql-server-ver15

But it does the hashing on only one column. That too on a column which is of varchar().

How do I perform the HASHBYTES() function on all the rows and all the columns in a table and not just one column that returns hashes as mentioned above?

Upvotes: 3

Views: 2321

Answers (1)

nvogel
nvogel

Reputation: 25526

HASHBYTES('MD5',x) in MSSQL is the equivalent of MD5(x) in Postgres. Both functions accept only a single value as input.

What you are really asking is how to replicate the behaviour of ::text, which converts an entire table to an array of text values. One problem with this is that the output of casting a row as text in Postgres is very particular to Postgres. The formatting rules could be difficult to reproduce exactly on another platform and any changes in quoting for example would result in a different hash. A better option might be to convert your data to a more standard format like JSON and then hash that result. For example:

MS SQL:

WITH p (ky,val) AS
(SELECT 1,'foo' UNION ALL SELECT 2,'bar')
SELECT j, HASHBYTES('MD5',CAST(j AS VARCHAR(MAX))) AS md5
FROM p AS p1
CROSS APPLY (SELECT * FROM p WHERE p.ky=p1.ky FOR JSON AUTO, WITHOUT_ARRAY_WRAPPER) AS t(j);

{"ky":1,"val":"foo"}    84C700DA7093081E7A800D1790BE09CE
{"ky":2,"val":"bar"}    A364B3F954F1A875540FE361CABFFD2A

PostgreSQL:

WITH p (ky,val) AS
(SELECT 1,'foo' UNION ALL SELECT 2,'bar'),
p1 AS
(SELECT row_to_json(p) as j
FROM p)
SELECT j,md5(j::text) as md5 from p1;

{"ky":1,"val":"foo"}    84c700da7093081e7a800d1790be09ce
{"ky":2,"val":"bar"}    a364b3f954f1a875540fe361cabffd2a

Upvotes: 3

Related Questions