Reputation: 39
I have dataset as shown in below image
When is_session_change value is true i am populating new value for session column and i want to carry that session value till the next TRUE value of is_session_change column.
For example below image -
Column expected_session is the expected value. Please let me know if anyone can help!! Thanks in advance. Actually i am processing the data hourly, and for particular run i am processing current hour data plus last hour data, so last hour data will have session ids attached properly, now i am trying to extend those session ids for eligible records in current hour, if record is not eligible to extend i am generating new session id.
I have tried below approach which have populated values in session column -
select *,case when (session_id != '-1') then session_id
when (new_session = true) then getSessionId()
when (new_session =false AND session_id = '-1') then LAG(session_id) OVER (PARTITION BY uniquevisitor ORDER BY ts)
else '-' END as abc
from mktg_web.web_session_final
order by uniquevisitor
Upvotes: 0
Views: 269
Reputation: 49270
You already have the occurrence column to group the values based on session. Use a case
expression to set -1
values to null
and use max
window function to get the expected_session value.
select w.*,max(case when session <> -1 then session end)
over(partition by uniquevisitor,occurrence) as expected_session
from mktg_web.web_session_final w
Upvotes: 1