Charles Joseph
Charles Joseph

Reputation: 41

German Umlaut hash - SHA256 on SQL server

I am facing a problem when applying SHA256 hash to German Umlaut Characters.

--Without Umlaut
SELECT CONVERT(VARCHAR(MAX), HASHBYTES('SHA2_256','o'), 2) as HASH_ID
Sql server Output 65C74C15A686187BB6BBF9958F494FC6B80068034A659A9AD44991B08C58F2D2

This is matching to the output in https://www.pelock.com/products/hash-calculator

--With Umlaut

SELECT CONVERT(VARCHAR(MAX), HASHBYTES('SHA2_256','ö'), 2)
Sql server Output B0B2988B6BBE724BACDA5E9E524736DE0BC7DAE41C46B4213C50E1D35D4E5F13
Output from pelock: 6DBD11FD012E225B28A5D94A9B432BC491344F3E92158661BE2AE5AE2B8B1AD8

I want the SQL server output to match to pelock. I have tested outputs from other sources (Snowflake and python) and all of it aligns with output from pelock. Not sure why SQL server is not giving the right result. Any help is much appreciated.

Upvotes: 2

Views: 523

Answers (1)

Charlieface
Charlieface

Reputation: 71995

You have two issues:

  • The literal text itself is being reinterpreted, because you have the wrong database collation. You can use the N prefix to prevent that, but this leads to a second problem...
  • The value from pelock is UTF-8, but using N means it will be UTF-16 nvarchar.

So you need to use a UTF-8 binary collation, the N prefix and cast it back to varchar.

SELECT CONVERT(VARCHAR(MAX), HASHBYTES('SHA2_256',CAST(N'ö' COLLATE Latin1_General_100_BIN2_UTF8 AS varchar(100))), 2)
Result
6DBD11FD012E225B28A5D94A9B432BC491344F3E92158661BE2AE5AE2B8B1AD8

db<>fiddle

UTF-8 collations are only supported in SQL Server 2019 and later. In older version you would need to find a different collation that deals with the characters you have. It may not be possible to find a collation that deals with all of your data.

Upvotes: 3

Related Questions