alt-f4
alt-f4

Reputation: 2306

How can I port bit string type from Postgres to Snowflake?

Context Currently I am hashing values in Postgresql database using a salt. I am trying to migrate the same functionality from Postgres to Snowflake.

What I have tried to do

Currently data is being hashed in Postgres like:

SELECT ('x' || md5('bla'::TEXT || (SELECT 'saltysalt')))::BIT(64)::BIGINT;

I am trying to replicate the same procedure for Snowflake. I am trying to take it step by step and this is where I have stopped:

-- Not sure how can I apply the BIT(64)[docs: https://www.postgresql.org/docs/8.1/datatype-bit.html] type to Snowflake
SELECT ('x' || md5('bla'::TEXT || (SELECT 'saltysalt')));

I have tried using Snowflake's native TO_BINARY function, but I get an error like:

The following string is not a legal base64-encoded value: 'x6d0d80fc2ace4e80'

Summary

How can I port my hashing methodology from Postgres to Snowflake? What would be the equivalent of bit-string types in Snowflake?

Upvotes: 0

Views: 594

Answers (1)

Gokhan Atil
Gokhan Atil

Reputation: 10039

I did not understand why you produce a decimal after hashing, or why you use SELECT for a literal.

As I see, you query can be written like below in Postgres:

SELECT ('x' || md5('bla' || 'saltysalt'))::BIT(64)::BIGINT;

7858078745369661000 <--- and this is the output

So what you do here is, add a salt to string, calculate the MD5, convert it to binary, get only first 64 bits (to avoid benefit of MD5 hashing), then convert it to bigint.

I see a problem here. The BIGINT can keep 8 digits but it is designed to store signed values. On the other hand, your original hexadecimal is not a signed value, so the conversion would not be correct. Let me demonstrate:

SELECT md5('bla' || 'saltysalt');

6d0d80fc2ace4e80c47bfd51f39f42f6 <-- output of above command

SELECT ('x' || md5('bla' || 'saltysalt'))::BIT(64)::BIGINT;

7858078745369661000 <-- you save the first 8 digits as bigint

select to_hex(7858078745369661000);

6d0d80fc2ace4e48 <-- when you convert back to hex, you get a different number!!! compare with the first output

If there were no issue on hex/bigint conversions, your query could be easily written like this in Snowflake:

SELECT to_number(LEFT(md5('bla' || 'saltysalt'),16),'XXXXXXXXXXXXXXXX');

Upvotes: 1

Related Questions