Reputation: 109
I need a SQL Server equivalent Hashbytes SHA1 output from Snowflake
SQL Server:
select hashbytes('sha1',cast('214163915155286000' as varchar(18)))*1
or
select cast(hashbytes('sha1',cast('214163915155286000' as varchar(18))) as int)
SQL Server Output:
2143072043
I am able to generate hashbytes output from snowflake using below but now I am unable to convert it into numeric value
select to_char(to_binary(sha1('214163915155286000'), 'hex'), 'base64') as Result;
Partial Snowflake Output :
N0VDrFqYkK+M2GPrfJjnRn+8rys=
Expected Output from Snowflake:
2143072043
FYI - I have tried SQL Server Code here
http://sqlfiddle.com/#!18/9eecb/150528
Upvotes: 1
Views: 1039
Reputation: 25978
So doing a small amount of playing in SQL Server.
select '214163915155286000' as s,
hashbytes('sha1', '214163915155286000') as h,
cast(hashbytes('sha1', '214163915155286000') as int) as i
;
Gives:
s, 214163915155286000
h, 0x374543AC5A9890AF8CD863EB7C98E7467FBCAF2B
i, 2143072043
so my SQL is the same as your, as good start.
converting 2143072043
to hex we get 7FBCAF2B
which is the last 4 bytes of the hash.
Thus the you want to get the result of hash and truncate, which I will BITAND to achieve, but last time I used the snowflake BIT functions they did allow hex input, so in stead of type a rather clear 0xFFffFFff
we will use the decimal of that 4294967295
, thus this should work for you:
select bitand(sha1('214163915155286000'), 4294967295);
Right so thanks to Gokhan's insights, and logging into snowflake, and reading the manual for BITSHIFTLEFT/RIGHT. we can use the shifts, but the output is a 128 bit number, not a 64 bit, as I had assumed, to extend the sign bit correctly we have to shift by 96 bits, which this code shows working:
SELECT
column1 as input,
sha1(input) as sha1,
right(sha1,8) as right8,
to_number(right8,'XXXXXXXX') as int32,
BITSHIFTRIGHT(BITSHIFTLEFT(int32,96),96) as result
FROM VALUES
('214163915155286001'),
('214163915155286000')
ORDER BY 1;
which gives the output:
INPUT | SHA1 | RIGHT8 | INT32 | RESULT |
---|---|---|---|---|
214163915155286000 | 374543ac5a9890af8cd863eb7c98e7467fbcaf2b | 7fbcaf2b | 2143072043 | 2143072043 |
214163915155286001 | 1911d3df794846fbc74e0e4cf29133459466e0e7 | 9466e0e7 | 2489770215 | -1805197081 |
so a more compact and final block of SQL can be:
BITSHIFTRIGHT(BITSHIFTLEFT(to_number(right(sha1(input),8),'XXXXXXXX'),96),96)
Upvotes: 1
Reputation: 10099
When I examined your post yesterday, I noticed that Ms SQL Server uses the last 8 digits of the SHA1 Hash (as Simon pointed), so I wrote this query:
select to_number(right( sha1('214163915155286000'), 8 ), 'XXXXXXXX' );
It produces 2143072043 as expected, but unfortunately, this is not enough if we want to simulate the behavior of Ms SQL Server.
If you convert '214163915155286001', the above conversion returns 2489770215. If we do the exact conversion in Ms SQL Server, it returns -1805197081. The problem is, Ms SQL Server, stores the result in a "signed integer". In Snowflake, there is no signed integer. The integer is synonymous with NUMBER. So we need to handle the "signed" bit. This is why I wrote the following UDF:
create or replace function convertMStoSF( SHA varchar )
returns NUMBER
language SQL
as
$$
IFF( bitand( to_number(right( sha1( SHA ), 8 ), 'XXXXXXXX' ) , 2147483648 ) > 0,
- 2147483648 + bitand( 2147483647, to_number(right( sha1( SHA ), 8 ), 'XXXXXXXX' ) ) ,
to_number(right( sha1( SHA ), 8 ), 'XXXXXXXX' ) )
$$;
It gets the last 8 digits, converts to a number, and checks the signed bit. If it's 1, then it calculates the signed value. If it's 0, it returns the number directly. In the UDF, I used the decimal representation instead of the hexadecimal:
0x7FFFFFFF -> 2147483647
0x80000000 -> 2147483648
Here is a query to test the results:
select '214163915155286000' A, convertMStoSF( A ), '214163915155286001' B, convertMStoSF( B );
+--------------------+--------------------+--------------------+--------------------+
| A | CONVERTMSTOSF( A ) | B | CONVERTMSTOSF( B ) |
+--------------------+--------------------+--------------------+--------------------+
| 214163915155286000 | 2143072043 | 214163915155286001 | -1805197081 |
+--------------------+--------------------+--------------------+--------------------+
I have to say that I'm not sure if it's 100% correct, as I do my test with only a few sample numbers.
Upvotes: 1