Liam
Liam

Reputation: 159

Generate a column of random dates in BigQuery

How would I go about creating a new column of randomly selected dates with bigquery? This is closely related to this question, but the dates should be random and not joined by anything. Ideally, it would just be a column of IDs and a column of random dates in a given range:

ID    Date
1     2010-01-01
2     2012-03-31
3     2011-07-11
4     2015-06-09

This is easier on regular sql, but I cannot get this approach to work with the bigquery sql syntax.

Upvotes: 1

Views: 2810

Answers (2)

Maxime Oriol
Maxime Oriol

Reputation: 143

On BigQuery by using the RAND function on a delimited time window:

-- Parameters
DECLARE start_dt DEFAULT DATE '2010-01-01'; -- Lower limit
DECLARE end_dt DEFAULT date '2020-12-31'; -- Upper limit
DECLARE n DEFAULT 30; -- Quantity of random dates

-- Random dates
SELECT DATE_ADD(start_dt , INTERVAL CAST(DATE_DIFF(end_dt, start_dt, DAY)*RAND() AS INT64) DAY) random_date
FROM UNNEST(GENERATE_ARRAY(1, n));

Upvotes: 1

Mikhail Berlyant
Mikhail Berlyant

Reputation: 172974

Below is for BigQuery Standard SQL

#standardSQL
WITH parameters AS (
  SELECT 100 ids_count, DATE '2010-01-01' start_date, DATE '2020-12-31' finish_date
)
SELECT id, DATE_FROM_UNIX_DATE(CAST(start + (finish - start) * RAND() AS INT64)) random_date
FROM parameters, 
UNNEST(GENERATE_ARRAY(1, ids_count)) id,
UNNEST([STRUCT(UNIX_DATE(start_date) AS start, UNIX_DATE(finish_date) AS finish)])
-- ORDER BY id

Upvotes: 4

Related Questions