Reputation: 159
How would I go about creating a new column of randomly selected dates with bigquery? This is closely related to this question, but the dates should be random and not joined by anything. Ideally, it would just be a column of IDs and a column of random dates in a given range:
ID Date
1 2010-01-01
2 2012-03-31
3 2011-07-11
4 2015-06-09
This is easier on regular sql, but I cannot get this approach to work with the bigquery sql syntax.
Upvotes: 1
Views: 2810
Reputation: 143
On BigQuery by using the RAND
function on a delimited time window:
-- Parameters
DECLARE start_dt DEFAULT DATE '2010-01-01'; -- Lower limit
DECLARE end_dt DEFAULT date '2020-12-31'; -- Upper limit
DECLARE n DEFAULT 30; -- Quantity of random dates
-- Random dates
SELECT DATE_ADD(start_dt , INTERVAL CAST(DATE_DIFF(end_dt, start_dt, DAY)*RAND() AS INT64) DAY) random_date
FROM UNNEST(GENERATE_ARRAY(1, n));
Upvotes: 1
Reputation: 172974
Below is for BigQuery Standard SQL
#standardSQL
WITH parameters AS (
SELECT 100 ids_count, DATE '2010-01-01' start_date, DATE '2020-12-31' finish_date
)
SELECT id, DATE_FROM_UNIX_DATE(CAST(start + (finish - start) * RAND() AS INT64)) random_date
FROM parameters,
UNNEST(GENERATE_ARRAY(1, ids_count)) id,
UNNEST([STRUCT(UNIX_DATE(start_date) AS start, UNIX_DATE(finish_date) AS finish)])
-- ORDER BY id
Upvotes: 4