Chug
Chug

Reputation: 63

For duplicate rows, creating a column that shows status based on date

I am hoping for some help with the scenario below in SQL

Current State:

enter image description here

For a particular combination of med_id, casemgr_id, casemgr_clnt_id, the status should be reflected accordingly:

Desired Output:

enter image description here

In the first scenario, for med_id=98410, casemgr_id=12345, casemgr_clnt_id=67891, the date (1strow) is valid, but for the same rows, if the dates fall within 6 months, then the status should be invalid. In the second scenario, for med_id=91956, casemgr_id=99012, casemgr_clnt_id=87567, the date (1strow) is valid, but for the second row, the status is valid since the date is greater than 6 months. In summary, for duplicate rows for med_id, casemgr_id and casemgr_clntid, the status for the first row will be valid, but for the subsequent row, it will be valid id the date is > 6 months, otherwise it will be invalid if less than 6 months.

How do I achieve this in SQL? Please assist!

Upvotes: 0

Views: 80

Answers (2)

marcothesane
marcothesane

Reputation: 6749

You can't without window functions, as @Jan Suchanek already showed.

Mine is similar - it also uses LAG(), but it uses some sort of "anchor" using FIRST_VALUE() , to use MONTHS_BETWEEN() upon:

WITH
-- your input ...
indata(id1,med_id,date,casemgr_id,casemgr_cln,status) AS (
          SELECT 123456,98410,DATE '4/19/24',12345,67891,-2
UNION ALL SELECT 789101,98410,DATE '4/24/24',122345,67891,-2
UNION ALL SELECT 234561,98410,DATE '4/25/24',12345,67891,-2
UNION ALL SELECT 567890,98410,DATE '4/26/24',12345,67891,-2
UNION ALL SELECT 456789,91956,DATE '4/20/24',99012,87567,-2
UNION ALL SELECT 998415,91956,DATE '12/20/24',99012,87567,-2
)
-- end of input, real query starts here ...
SELECT
  id1
, med_id
, date
, casemgr_id
, casemgr_cln
, CASE
    WHEN MONTHS_BETWEEN(
           date
         , FIRST_VALUE(date) OVER(PARTITION BY med_id, casemgr_id, casemgr_cln ORDER BY date)
         )  > 6
      OR MONTHS_BETWEEN(
           date
         , FIRST_VALUE(date) OVER(PARTITION BY med_id, casemgr_id, casemgr_cln ORDER BY date)
         )  = 0
    THEN 'valid'
    ELSE 'invalid'
  END AS status
FROM indata
ORDER BY med_id, casemgr_id, casemgr_cln, date 
;
id1 med_id date casemgr_id casemgr_cln status
456,789 91,956 2024-04-20 99,012 87,567 valid
998,415 91,956 2024-12-20 99,012 87,567 valid
123,456 98,410 2024-04-19 12,345 67,891 valid
234,561 98,410 2024-04-25 12,345 67,891 invalid
567,890 98,410 2024-04-26 12,345 67,891 invalid
789,101 98,410 2024-04-24 122,345 67,891 valid

Upvotes: 0

Jan Suchanek
Jan Suchanek

Reputation: 301

It might be someting like

SELECT 
    id1,
    med_id,
    date,
    casemgr_id,
    casemgr_clnt_id,
    CASE 
        WHEN ROW_NUMBER() OVER (PARTITION BY med_id, casemgr_id, casemgr_clnt_id ORDER BY date) = 1 THEN 'valid'
        WHEN MONTHS_BETWEEN(date, 
                LAG(date) OVER (PARTITION BY med_id, casemgr_id, casemgr_clnt_id ORDER BY date)) <= 6 THEN 'invalid'
        ELSE 'valid'
    END AS status
FROM your_table
ORDER BY med_id, casemgr_id, casemgr_clnt_id, date;

Upvotes: 1

Related Questions