Reputation: 115
I am trying to perform something here. I want to have all the coluns filled with values. But when I have the null column, I want to have it filled with the value from the previous not null column.
with cte as (
select '2019-11-12 16:01:55' as timestamp, null as owner_id, null as owner_assigneddate, null as lastmodifieddate union all
select '2019-11-12 19:03:18' as timestamp, 39530934 as owner_id, '2019-11-12 19:03:18' as owner_assigneddate, '2019-11-12 19:03:18' as lastmodifieddate union all
select '2019-11-12 19:03:19' as timestamp, null as owner_id, null as owner_assigneddate, '2019-11-12 19:03:19' as lastmodifieddate union all
select '2019-11-12 19:03:20' as timestamp, null as owner_id, null as owner_assigneddate, '2019-11-12 19:03:20' as lastmodifieddate union all
select '2019-11-12 19:03:31' as timestamp, 40320368 as owner_id, '2019-11-12 19:03:31' as owner_assigneddate, '2019-11-12 19:03:31' as lastmodifieddate union all
select '2019-11-12 19:03:33' as timestamp, null as owner_id, null as owner_assigneddate, '2019-11-12 19:03:33' as lastmodifieddate union all
select '2019-11-12 19:03:56' as timestamp, null as owner_id, null as owner_assigneddate, '2019-11-12 19:03:356' as lastmodifieddate)
select timestamp,
owner_id,
owner_assigneddate,
lastmodifieddate,
COALESCE(owner_id, LEAD(owner_id) OVER(ORDER BY timestamp DESC)) AS test_column
from cte order by timestamp asc
With the previous query I already managed to put the value only in the next row.
What I want to do is to have the all the columns filled with the value based on the previous row. The value from row 4 should be 39530934 and the value from row 7 should be 40320368. I think I am missing something here, but I dont't know what.
Upvotes: 4
Views: 4555
Reputation: 1269673
In BigQuery use LAST_VALUE()
with the IGNORE NULLS
option and COALESCE()
:
select timestamp,
COALESCE(owner_id, last_value(owner_id ignore nulls) over (order by timestamp)) as owner_id,
COALESCE(owner_assigneddate, LAST_VALUE(owner_assigneddate IGNORE NULLS) OVER (ORDER BY TIMESTAMP)) as owner_assigneddate,
COALESCE(lastmodifieddate, LAST_VALUE(lastmodifieddate IGNORE NULLS) OVER (ORDER BY TIMESTAMP)) as lastmodifieddate
from cte order by timestamp asc
Upvotes: 8
Reputation: 222432
As far as concerns, Big Query does not support ignore null
in window functions. Here is a solution that relies on a window max to locate the record that holds the last non-null owner_id
(this assumes uniqueness of timestamps). With this information at hand, you can then bring in the corresponding owner_id
with a join.
select
c.timestamp,
coalesce(c.owner_id, c_lag.owner_id) owner_id,
c.owner_assigneddate,
c.lastmodifieddate
from
(
select
cte.*,
max(case when owner_id is not null then timestamp end)
over(order by timestamp rows unbounded preceding) target_timestamp
from cte
) c
left join cte c_lag
on c.owner_id is null
and c_lag.timestamp = c.target_timestamp
timestamp | owner_id | owner_assigneddate | lastmodifieddate :------------------ | -------: | :------------------ | :------------------- 2019-11-12 16:01:55 | null | null | null 2019-11-12 19:03:18 | 39530934 | 2019-11-12 19:03:18 | 2019-11-12 19:03:18 2019-11-12 19:03:19 | 39530934 | null | 2019-11-12 19:03:19 2019-11-12 19:03:20 | 39530934 | null | 2019-11-12 19:03:20 2019-11-12 19:03:31 | 40320368 | 2019-11-12 19:03:31 | 2019-11-12 19:03:31 2019-11-12 19:03:33 | 40320368 | null | 2019-11-12 19:03:33 2019-11-12 19:03:56 | 40320368 | null | 2019-11-12 19:03:356
Note: to better understand the logic if needed, you can run the inner query independantly to see what it returns (see the db fiddle).
Edit
Rereading this, I find that the information that the window max provides is already available in your original data, in column owner_assigneddate
... So that’s far simpler:
select
c.timestamp,
coalesce(c.owner_id, c_lag.owner_id) owner_id,
c.owner_assigneddate,
c.lastmodifieddate
from
cte c
left join cte c_lag
on c.owner_id is null
and c_lag.timestamp = c.owner_assigneddate
Upvotes: 2
Reputation: 7734
This should work with your cte
definition:
...
select timestamp,
owner_id,
owner_assigneddate,
lastmodifieddate,
LAST_VALUE(owner_id IGNORE NULLS)
OVER(ORDER BY timestamp ASC ROWS BETWEEN
UNBOUNDED PRECEDING AND CURRENT ROW) AS test_column
from cte order by timestamp asc
Upvotes: 2