Chug
Chug

Reputation: 63

Lag or first_value doesn't work when year is changed

    with
    --my input--
    x(id1,med_id,dt,dt_tm,casemgr_id,casemgr_clntid,status) as (
    select 123456,98410,date'2024-04-19',timestamp'2024-04-19 09:00:00',12345,67891,-2
    union all
    select 194567,98410,date'2024-04-19',timestamp'2024-04-19 11:00:00',12345,67891,-2
    union all
    select 789101,98410,date'2024-04-24',timestamp'2024-04-24 09:00:00',12345,67891,-2
    union all
    select 194587,98410,date'2024-04-25',timestamp'2024-04-25 09:00:00',12345,67891,-2
    union all
    select 234561,98410,date'2024-04-26',timestamp'2024-04-26 09:00:00',12345,67891,-2
    union all
    select 456789,91956,date'2024-04-19',timestamp'2024-04-19 08:00:00',99012,87567,-2
    union all
    select 998415,91956,date'2024-12-20',timestamp'2024-12-20 07:00:00',99012,87567,-2
union all
  select 01786,18745,date'2023-11-16',timestamp'2023-11-16 07:00:00',12434,87438,-2
union all
  select 01786,18745,date'2023-11-21',timestamp'2023-11-21 07:00:00',12434,87438,-2
union all
  select 01786,18745,date'2023-12-01',timestamp'2023-12-01 07:00:00',12434,87438,-2
union all
  select 01786,18745,date'2023-12-04',timestamp'2023-12-04 07:00:00',12434,87438,-2
union all
  select 01786,18745,date'2024-02-02',timestamp'2024-02-02 07:00:00',12434,87438,-2
union all
  select 01786,18745,date'2024-07-05',timestamp'2024-07-05 07:00:00',12434,87438,-2
union all
  select 01786,18745,date'2024-07-21',timestamp'2024-07-21 07:00:00',12434,87438,-2
union all
  select 01786,18745,date'2024-07-21',timestamp'2024-07-23 07:00:00',12434,87438,-2
    );

Input above. What I am trying to achieve is that for a combination of med_id, casemgr_clntid and dt, the first record sorted by date should be valid, and the subsequent records should be invalid if it falls within 180 days. Then the code should look for the first record after the 180 day timeframe, that record should be valid, and this record should be the starting point and the subsequent rows should be invalid. If greater than 180 days, it should be valid. if there are multiple records on the same day, like in the first two instances, the earlier record should be valid since it's earlier timestamp and the other should be invalid since it's later in the day, In other words, the first record at 9am should be valid while the second record should be invalid. I have also put the desired output at the end. So for example in the last three rows, where the record on 5th july is valid since it's the first record after the 180 day time window, but subsequently it should be invalid (in other words, it has restarted the time window). So the records on 21st July and 23rd july should take the record on 5th july as a starting point

-- sql qry which was suggested, tweaked to explictly put the date format...
SELECT
  id1
, med_id
, dt
,dt_tm
, casemgr_id
, casemgr_clntid
, CASE WHEN (
           to_char(dt,'yyyy-mm-dd)' -
         , lag(to_char(dt,'yyyy-mm-dd')) OVER(PARTITION BY med_id, casemgr_clntid ORDER BY dt_tm)
         )  is null
      OR (
           to_char(dt,'yyyy-mm-dd)' -
         , lag(to_char(dt,'yyyy-mm-dd')) OVER(PARTITION BY med_id, casemgr_clntid ORDER BY dt_tm)

         )  >180
    THEN 'valid'
    ELSE 'invalid'
  END AS status
FROM x
ORDER BY med_id, casemgr_clntid, dt_tm 
;

I also tried subtracting by dt_tm, but the dt_tm-first_value(dt_tm) doesn't work.

Error: invalid operation

    with
        --my desired output--
        x(id1,med_id,dt,dt_tm,casemgr_id,casemgr_clntid,status) as (
        select 123456,98410,date'2024-04-19',timestamp'2024-04-19 09:00:00',12345,67891, 'valid'
        union all
        select 194567,98410,date'2024-04-19',timestamp'2024-04-19 11:00:00',12345,67891,'invalid'
        union all
        select 789101,98410,date'2024-04-24',timestamp'2024-04-24 09:00:00',12345,67891,'invalid'
        union all
        select 194587,98410,date'2024-04-25',timestamp'2024-04-25 09:00:00',12345,67891,'invalid'
        union all
        select 234561,98410,date'2024-04-26',timestamp'2024-04-26 09:00:00',12345,67891,'invalid'
        union all
        select 456789,91956,date'2024-04-19',timestamp'2024-04-19 08:00:00',99012,87567,'valid'
        union all
        select 998415,91956,date'2024-12-20',timestamp'2024-12-20 07:00:00',99012,87567,'valid'
union all
  select 01786,18745,date'2023-11-16',timestamp'2023-11-16 07:00:00',12434,87438,valid
union all
  select 01786,18745,date'2023-11-21',timestamp'2023-11-21 07:00:00',12434,87438,invalid
union all
  select 01786,18745,date'2023-12-01',timestamp'2023-12-01 07:00:00',12434,87438,invalid
union all
  select 01786,18745,date'2023-12-04',timestamp'2023-12-04 07:00:00',12434,87438,invalid
union all
  select 01786,18745,date'2024-02-02',timestamp'2024-02-02 07:00:00',12434,87438,invalid
union all
  select 01786,18745,date'2024-07-05',timestamp'2024-07-05 07:00:00',12434,87438,valid
union all
  select 01786,18745,date'2024-07-21',timestamp'2024-07-21 07:00:00',12434,87438,invalid
union all
  select 01786,18745,date'2024-07-23',timestamp'2024-07-23 07:00:00',12434,87438,invalid

        );

Upvotes: 0

Views: 121

Answers (2)

nbk
nbk

Reputation: 49393

You need LAG and the check for NULL which marks the first value, that is valid and then you need to check also if 180 day is already past which you check with > 180

    with
    --my input--
    x(id1,med_id,dt,dt_tm,casemgr_id,casemgr_clntid,status) as (
    select 123456,98410,date'2024-04-19',timestamp'2024-04-19 09:00:00',12345,67891,-2
    union all
    select 194567,98410,date'2024-04-19',timestamp'2024-04-19 11:00:00',12345,67891,-2
    union all
    select 789101,98410,date'2024-04-24',timestamp'2024-04-24 09:00:00',12345,67891,-2
    union all
    select 194587,98410,date'2024-04-25',timestamp'2024-04-25 09:00:00',12345,67891,-2
    union all
    select 234561,98410,date'2024-04-26',timestamp'2024-04-26 09:00:00',12345,67891,-2
    union all
    select 456789,91956,date'2024-04-19',timestamp'2024-04-19 08:00:00',99012,87567,-2
    union all
    select 998415,91956,date'2024-12-20',timestamp'2024-12-20 07:00:00',99012,87567,-2
          union all
    select 998425,91956,date'2025-05-21',timestamp'2025-05-21 07:00:00',99012,87567,-2
union all
  select 01786,18745,date'2023-11-16',timestamp'2023-11-16 07:00:00',12434,87438,-2
union all
  select 01786,18745,date'2023-11-21',timestamp'2023-11-21 07:00:00',12434,87438,-2
union all
  select 01786,18745,date'2023-12-01',timestamp'2023-12-01 07:00:00',12434,87438,-2
union all
  select 01786,18745,date'2023-12-04',timestamp'2023-12-04 07:00:00',12434,87438,-2
union all
  select 01786,18745,date'2024-02-02',timestamp'2024-02-02 07:00:00',12434,87438,-2
union all
  select 01786,18745,date'2024-07-05',timestamp'2024-07-05 07:00:00',12434,87438,-2
    ),
      CTE as (
SELECT
  id1
, med_id
, dt
,dt_tm
, casemgr_id
, casemgr_clntid
      ,           dt -
          LAG(dt) OVER(PARTITION BY med_id, casemgr_clntid ORDER BY dt_tm) dayspassed

FROM x),
      CTE2 AS(
      SELECT
        id1
, med_id
, dt
,dt_tm
, casemgr_id
, casemgr_clntid
      ,dayspassed
      ,SUM(dayspassed) OVER (PARTITION BY med_id, casemgr_clntid ORDER BY dt_tm)/180 n
      FROM CTE)
      SELECT
              id1
, med_id
, dt
,dt_tm
, casemgr_id
      
, casemgr_clntid
      ,dayspassed
      ,CASE WHEN (n IS NULL)
      OR
      (n > CASE WHEN LAG(n) OVER (PARTITION BY med_id, casemgr_clntid ORDER BY dt_tm) IS NULL 
      THEN 0 ELSE LAG(n) OVER (PARTITION BY med_id, casemgr_clntid ORDER BY dt_tm) END)
      THEN 'Valid'
      ELSE 'invalid'
      END status
      FROM CTE2
ORDER BY med_id, casemgr_clntid, dt_tm 
;
id1 med_id dt dt_tm casemgr_id casemgr_clntid dayspassed status
1786 18745 2023-11-16 2023-11-16 07:00:00 12434 87438 null Valid
1786 18745 2023-11-21 2023-11-21 07:00:00 12434 87438 5 invalid
1786 18745 2023-12-01 2023-12-01 07:00:00 12434 87438 10 invalid
1786 18745 2023-12-04 2023-12-04 07:00:00 12434 87438 3 invalid
1786 18745 2024-02-02 2024-02-02 07:00:00 12434 87438 60 invalid
1786 18745 2024-07-05 2024-07-05 07:00:00 12434 87438 154 Valid
456789 91956 2024-04-19 2024-04-19 08:00:00 99012 87567 null Valid
998415 91956 2024-12-20 2024-12-20 07:00:00 99012 87567 245 Valid
998425 91956 2025-05-21 2025-05-21 07:00:00 99012 87567 152 Valid
123456 98410 2024-04-19 2024-04-19 09:00:00 12345 67891 null Valid
194567 98410 2024-04-19 2024-04-19 11:00:00 12345 67891 0 invalid
789101 98410 2024-04-24 2024-04-24 09:00:00 12345 67891 5 invalid
194587 98410 2024-04-25 2024-04-25 09:00:00 12345 67891 1 invalid
234561 98410 2024-04-26 2024-04-26 09:00:00 12345 67891 1 invalid
SELECT 14

Upvotes: 0

Thorsten Kettner
Thorsten Kettner

Reputation: 95082

If I am not mistaken, the dt column is redundant, as the date is included in dt_tm. And the whole thing is not about the first dt per med_id and casemgr_clntid anyway, but about their first (i.e. minimum) dt_tm.

SELECT
  id1,
  med_id,
  dt,
  dt_tm,
  casemgr_id,
  casemgr_clntid,
  CASE WHEN dt_tm = MIN(dt_tm) OVER(PARTITION BY med_id, casemgr_clntid)
         OR dt_tm >= CAST(MIN(dt_tm) OVER(PARTITION BY med_id, casemgr_clntid) AS DATE)
                     + INTERVAL '181' DAY
    THEN 'valid'
    ELSE 'invalid'
  END AS status
FROM x
ORDER BY med_id, casemgr_clntid, dt_tm;

Upvotes: 0

Related Questions