Shivam Gupta
Shivam Gupta

Reputation: 82

Redshift query for diff finding not working

Input:

company_id  description  timestamp           user_id 
1           logged in    2019-08-26 06:37:16   1
1           logged out    2019-08-26 06:39:16  1
1           logged in    2019-08-26 06:39:16   2
1           logged out    2019-08-26 07:37:16  2

In redshift, I want to get diff((logged-in)-(logged-out))

expected output

company_id  description  timestamp           user_id  diff(seconds)
1           logged in    2019-08-26 06:37:16   1      120 
1           logged out    2019-08-26 06:39:16  1      NULL
1           logged in    2019-08-26 06:39:16   2      100000
1           logged out    2019-08-26 07:37:16  2      Null

I tried this but was unable to get result as expected:

Redshift query for finding difference between login and logout not working

select
    timestamp, LogStartTime, LoginEndTime
    , Total_Time = right(concat('0', Total_Time / 3600), 2) + ':' + right(concat('0', Total_Time % 3600 / 60), 2) + ':' + right(concat('0', Total_Time % 60), 2)
    , OnFloor = right(concat('0', OnFloor / 3600), 2) + ':' + right(concat('0', OnFloor % 3600 / 60), 2) + ':' + right(concat('0', OnFloor % 60), 2)
    , OffFloor = right(concat('0', OffFloor / 3600), 2) + ':' + right(concat('0', OffFloor % 3600 / 60), 2) + ':' + right(concat('0', OffFloor % 60), 2)
    , user_id
from (
    select
        user_id, timestamp = cast(min(timestamp) as date)
        , LogStartTime = min(timestamp)
        , LoginEndTime = max(timestamp)
        , Total_Time = sum(ss)
        , OnFloor = sum(iif(rtrim(ltrim(description)) = 'logged in', ss, 0))
        , OffFloor = sum(iif(rtrim(ltrim(description)) = 'logged out', ss, 0))
    from (
        select
            *,ss = datediff(ss,timestamp,lead(timestamp) over (partition by user_id, order by timestamp))
        from (
            select
                *, grp = sum(diff) over (partition by user_id order by timestamp)
            from (
                select
                    *, diff = iif(datediff(mi, lag(timestamp) over (partition by user_id order by timestamp), timestamp) > 300 and description = 'logged in', 1, 0)
                from
                    webactivities_9hmg8q
            ) t
        ) t
    ) t
    group by user_id, grp
) t
company_id  description  timestamp           user_id  diff(seconds)
1           logged in    2019-08-26 06:37:16   1      120 
1           logged out    2019-08-26 06:39:16  1      NULL
1           logged in    2019-08-26 06:39:16   2      100000
1           logged out    2019-08-26 07:37:16  2      Null

Upvotes: 1

Views: 74

Answers (1)

Gordon Linoff
Gordon Linoff

Reputation: 1270713

You can use window functions:

select wa.*,
       (case when description = 'logged in' and
                  next_description = 'logged out'
             then datediff(second, timestamp, next_timestamp)
        end)
from (select wa.*,
             lead(description) over (partition by company_id, user_id order by timestamp) as next_description,
             lead(timestamp) over (partition by company_id, user_id order by timestamp) as next_timestamp
      from webactivities_9hmg8q wa
     ) wa;

Upvotes: 1

Related Questions