Kiew
Kiew

Reputation: 15

Google BigQury SHA512 function output vs OpenSSL difference

A hash is not an opinion, why Google BigQuery sha512 function output is inconsistent with OpenSSL?

Google BigQuery output of SELECT SHA512("Hello World"); LHT9F+2v2A6ER7DUZ0HuJDt+t03SFJoKsbkkb7MDgvJ+hT2FhXGeDmfL2g2qj1FnEGRhXWRa4nrLFb+xRH9Fmw==

MariaDB output of SELECT SHA2("Hello World", 512); and OpenSSL output of echo -n "Hello World" | sha512sum both yield: 2c74fd17edafd80e8447b0d46741ee243b7eb74dd2149a0ab1b9246fb30382f27e853d8585719e0e67cbda0daa8f51671064615d645ae27acb15bfb1447f459b

Even a base64 of the output is widely different from BigQuery's: SELECT TO_BASE64(SHA2("Hello World", 512)); MmM3NGZkMTdlZGFmZDgwZTg0NDdiMGQ0Njc0MWVlMjQzYjdlYjc0ZGQyMTQ5YTBhYjFiOTI0NmZi MzAzODJmMjdlODUzZDg1ODU3MTllMGU2N2NiZGEwZGFhOGY1MTY3MTA2NDYxNWQ2NDVhZTI3YWNi MTViZmIxNDQ3ZjQ1OWI=

Upvotes: 1

Views: 174

Answers (3)

dave_thompson_085
dave_thompson_085

Reputation: 38771

sha512sum is NOT OpenSSL; it is a separate utility and usually part of GNU coreutils.

If you do want to use (or compare to) OpenSSL:

# to get hash in hex (matches your MariaDB except for the trailing hyphen, or sha512sum) 
echo -n Hello World | openssl sha512

# to get hash in base64 (matches your BigQuery) either of
echo -n Hello World | openssl sha512 -binary | base64 -w0
echo -n Hello World | openssl sha512 -binary | openssl base64 -A

Also note that echo -n doesn't work on Windows, or some now-mostly-obsolete versions of Unix or Unix shells, whereas on some Unixes or shells echo will modify values containing one or more backslashes (which of course screws up your hash); printf %s 'Hello World' is guaranteed to work right on any POSIX. On Windows CMD you can do set /p="Hello World" <NUL | ... -- but let's just not go there.

Upvotes: 1

Simon Goater
Simon Goater

Reputation: 1898

You're Google BigQuery output is the base64 encoded binary hash, not base64 encoded hexadecimal hash. A simple PHP script shows where the BigQuery comes from.

  $string = "Hello World";
  $hashstring = hash("sha512", $string, false);
  echo $hashstring."\n";
  echo base64_encode(hex2bin($hashstring))."\n";

Output:

2c74fd17edafd80e8447b0d46741ee243b7eb74dd2149a0ab1b9246fb30382f27e853d8585719e0e67cbda0daa8f51671064615d645ae27acb15bfb1447f459b
LHT9F+2v2A6ER7DUZ0HuJDt+t03SFJoKsbkkb7MDgvJ+hT2FhXGeDmfL2g2qj1FnEGRhXWRa4nrLFb+xRH9Fmw==

Upvotes: 1

kiran mathew
kiran mathew

Reputation: 2343

Adding some additional details to @Georg Richter and @Simon Goater answer:

According to the google cloud documentation, SHA512() computes the hash of the input using the SHA-512 algorithm and output returns as 64 bytes.Based on my understanding, the difference between the output is due to the format of return type. If your requirement is to obtain the same hash value for both bigquery and echo command. You can consider using below echo command:

echo -n "Hello World" | sha512sum | xxd -r -p | base64

Example:

BigQuery result:

image

Google cloud shell:

image

Upvotes: 1

Related Questions