shankshera
shankshera

Reputation: 985

Random String Function in BigQuery

I was trying to generate random string in BigQuery when I encountered this answer.

SELECT
    word
FROM
    `publicdata.samples.shakespeare`
WHERE
    RAND() < 10/(
    SELECT
        COUNT(*)
    FROM
        `publicdata.samples.shakespeare`)

It worked but I need to create a function out of that answer. This is my attempt to convert it but no luck.

CREATE TEMP FUNCTION
    random_word() AS ( (
        SELECT
            STRING_AGG(word, "_") AS aggd_word
        FROM (
            SELECT
                LOWER(REPLACE(word, "'", "")) AS word
            FROM
                `publicdata.samples.shakespeare`
            WHERE
                RAND() < 10/(
                SELECT
                    COUNT(*)
                FROM
                    `publicdata.samples.shakespeare`)
            LIMIT
                3)) );
SELECT
    random_word();

I got this error

Table not found: `publicdata.samples.shakespeare`;
failed to parse CREATE [TEMP] FUNCTION statement at [25:9]

Upvotes: 0

Views: 4620

Answers (1)

Elliott Brossard
Elliott Brossard

Reputation: 33705

One approach is to convert a hash into characters in the desired range:

CREATE TEMP FUNCTION MapChar(c INT64) AS (
  CASE
    WHEN c BETWEEN 0 AND 9 THEN 48 + c -- 0 to 9
    WHEN c BETWEEN 10 AND 35 THEN 55 + c -- A to Z
    ELSE 61 + c -- a to z
  END
);

CREATE TEMP FUNCTION RandString() AS ((
  SELECT CODE_POINTS_TO_STRING(ARRAY_AGG(MapChar(MOD(c, 62))))
  FROM UNNEST(TO_CODE_POINTS(SHA256(CAST(RAND() AS STRING)))) AS c
));

SELECT RandString();

You can useSHA512 instead of SHA256 if you want a longer string.

Upvotes: 3

Related Questions