Reputation: 33
So I've found similar resources that address how to do this in SQL, like this: Duplicating records to fill gap between dates
I understand that BigQuery may not be the best place to do this, so I'm trying to see if it's at all possible. When trying to run some of the methods in the link above above I'm hitting a wall as some of the functions aren't supported within BigQuery.
If a table exists with data structured like so:
MODIFY_DATE SKU STORE STOCK_ON_HAND
08/01/2016 00:00:00 1120010 21 100
08/05/2016 00:00:00 1120010 21 75
08/07/2016 00:00:00 1120010 21 40
How can I build a query within Google BigQuery that yields an output like the one below? A value at a given date is repeated until the next change for the dates in between:
MODIFY_DATE SKU STORE STOCK_ON_HAND
08/01/2016 00:00:00 1120010 21 100
08/02/2016 00:00:00 1120010 21 100
08/03/2016 00:00:00 1120010 21 100
08/04/2016 00:00:00 1120010 21 100
08/05/2016 00:00:00 1120010 21 75
08/06/2016 00:00:00 1120010 21 75
08/07/2016 00:00:00 1120010 21 40
I know I need to generate a table that has all the dates within a given range, but I'm having a hard time understanding if this can be done. Any ideas?
Upvotes: 3
Views: 1486
Reputation: 172993
How can I build a query within Google BigQuery that yields an output like the one below? A value at a given date is repeated until the next change for the dates in between
See example below
SELECT
MODIFY_DATE,
MAX(SKU_TEMP) OVER(PARTITION BY grp) AS SKU,
MAX(STORE_TEMP) OVER(PARTITION BY grp) AS STORE,
MAX(STOCK_ON_HAND_TEMP) OVER(PARTITION BY grp) AS STOCK_ON_HAND,
FROM (
SELECT
DAY AS MODIFY_DATE, SKU AS SKU_TEMP, STORE AS STORE_TEMP, STOCK_ON_HAND AS STOCK_ON_HAND_TEMP,
COUNT(SKU) OVER(ORDER BY DAY ASC ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS grp,
FROM (
SELECT DATE(DATE_ADD(TIMESTAMP("2016-08-01"), pos - 1, "DAY")) AS DAY
FROM (
SELECT ROW_NUMBER() OVER() AS pos, *
FROM (FLATTEN((
SELECT SPLIT(RPAD('', 1 + DATEDIFF(TIMESTAMP("2016-08-07"), TIMESTAMP("2016-08-01")), '.'),'') AS h
FROM (SELECT NULL)),h
)))
) AS DATES
LEFT JOIN (
SELECT DATE(MODIFY_DATE) AS MODIFY_DATE, SKU, STORE, STOCK_ON_HAND
FROM
(SELECT "2016-08-01" AS MODIFY_DATE, "1120010" AS SKU, 21 AS STORE, 75 AS STOCK_ON_HAND),
(SELECT "2016-08-05" AS MODIFY_DATE, "1120010" AS SKU, 22 AS STORE, 100 AS STOCK_ON_HAND),
(SELECT "2016-08-07" AS MODIFY_DATE, "1120011" AS SKU, 23 AS STORE, 40 AS STOCK_ON_HAND),
) AS TABLE_WITH_GAPS
ON TABLE_WITH_GAPS.MODIFY_DATE = DATES.DAY
)
ORDER BY MODIFY_DATE
Upvotes: 3
Reputation: 172993
I need to generate a table that has all the dates within a given range, but I'm having a hard time understanding if this can be done. Any ideas?
SELECT DATE(DATE_ADD(TIMESTAMP("2016-08-01"), pos - 1, "DAY")) AS DAY
FROM (
SELECT ROW_NUMBER() OVER() AS pos, *
FROM (FLATTEN((
SELECT SPLIT(RPAD('', 1 + DATEDIFF(TIMESTAMP("2016-08-07"), TIMESTAMP("2016-08-01")), '.'),'') AS h
FROM (SELECT NULL)),h
)))
Upvotes: 0