Reputation: 758
I'm trying to obtain, using Big Query, the table on "Desired result" (see figure below), starting from a Dummy table which can be initialized as follows:
with dummy_data as
(
select * from unnest([
struct(
date('2021-06-10') AS timestamp,
'A' AS category,
'5' as value),
(date('2021-06-11'), 'B', '3'),
(date('2021-06-12'), 'C', '2'),
(date('2021-06-14'), 'A', '7')
])
)
select * from dummy_data
I've made a start by generating a time series and cross joining but having trouble getting rid of the unnecessary rows that arise from this:
with dummy_data as
(
select * from unnest([
struct(
date('2021-06-10') AS timestamp,
'A' AS category,
'5' as values ),
(date('2021-06-11'), 'B', '3'),
(date('2021-06-12'), 'C', '2'),
(date('2021-06-14'), 'A', '7')
])
),
-- Initialize date series that covers needed range
date_series as (
select
*
from
unnest(generate_date_array('2021-06-10', '2021-06-14', interval 1 day)) as date
),
-- cross_join
cross_join as (
select * from dummy_data
cross join date_series
)
select * from cross_join
Which results in the following (rows in blue are the new ones I need, rows in gray the ones I'd like to remove):
Thanks!
Upvotes: 0
Views: 112
Reputation: 172993
Consider below approach
select timestamp, category, value
from (
select timestamp as start_timestamp, category, value, ifnull(
lead(timestamp) over(partition by category order by timestamp) - 1,
max(timestamp) over(order by timestamp desc)
) next_timestamp
from dummy_data
), unnest(generate_date_array(start_timestamp, next_timestamp)) timestamp
if applied to sample data in your question - output is
Upvotes: 2