Reputation: 1611
I have a table with information about service involvement and group assignments for two individuals. A person is identified by UNIQ_ID
; service start and end dates by START_DT
and END_DT
; and their group assignment by GRP
, which is assigned on ASSIGN_DT
.
UNIQ_ID START_DT END_DT ASSIGN_DT GRP
888 1-Mar 10-Mar 3-Mar Red
888 1-Mar 10-Mar 7-Mar Yellow
999 15-Mar 20-Mar 15-Mar Yellow
A person does not always get their group assignment on their first day - 888 does not, while 999 does. Until a person gets their group assignment, their GRP should be 'Unknown'. Assignment starts on the day of assignment, rather than the following day, and ends the day before the next group assignment, or when their service ends.
Can anyone help me with an efficient way to get to the following result set?
UNIQ_ID ASSIGN_START_DT ASSIGN_END_DT GRP
888 1-Mar 2-Mar Unknown
888 3-Mar 6-Mar Red
888 7-Mar 10-Mar Yellow
999 15-Mar 20-Mar Yellow
Reproducible code below:
CREATE TABLE EXAMPLE_1
(
UNIQ_ID NUMBER,
START_DT DATE,
END_DT DATE,
ASSIGN_DT DATE,
GRP VARCHAR2 (10)
);
alter session set nls_date_format='dd-mon-yy';
INSERT INTO EXAMPLE_1 VALUES (888, TO_DATE('01-MAR-20'), TO_DATE('10-MAR-20'), TO_DATE('03-MAR-20'), 'Red');
INSERT INTO EXAMPLE_1 VALUES (888, TO_DATE('01-MAR-20'), TO_DATE('10-MAR-20'), TO_DATE('07-MAR-20'), 'Yellow');
INSERT INTO EXAMPLE_1 VALUES (999, TO_DATE('15-MAR-20'), TO_DATE('20-MAR-20'), TO_DATE('15-MAR-20'), 'Yellow');
This is my current solution. Is there any way to do this more efficiently?
WITH SEQ1
AS (SELECT DISTINCT "UNIQ_ID", "DT", "END_DT"
FROM EXAMPLE_1
UNPIVOT
("DT" FOR "TIMES" IN ("START_DT", "ASSIGN_DT"))),
SEQ2
AS (SELECT t1."UNIQ_ID",
"DT",
NVL (
LEAD ("DT" - 1, 1)
OVER (PARTITION BY "UNIQ_ID" ORDER BY "DT"),
"END_DT")
AS "END_DT"
FROM SEQ1 t1),
SEQ3
AS (SELECT t1.*, NVL (t2."GRP", 'Unknown') AS "GRP"
FROM SEQ2 t1
LEFT JOIN EXAMPLE_1 t2
ON t1."UNIQ_ID" = t2."UNIQ_ID"
AND t2."ASSIGN_DT" BETWEEN t1."DT" AND t1."END_DT")
SELECT *
FROM SEQ3
ORDER BY 1, 2;
Upvotes: 0
Views: 58
Reputation: 1269633
My idea is to combine the state changes into a single table and then use window functions. This is a little complicated because of overlapping timeframes. But for the data you provided, this works:
with ug as (
select uniq_id, grp, assign_dt, end_dt
from example_1
union all
select uniq_id, null, start_dt, assign_dt
from example_1
where assign_dt <> start_dt
)
select uniq_id, grp, assign_dt,
lead(assign_dt, 1, min(end_dt) - 1) over (partition by uniq_id order by assign_dt) as end_dt
from ug
group by uniq_id, grp, assign_dt
order by uniq_id, assign_dt;
Here is a db<>fiddle.
Upvotes: 1
Reputation: 8655
From what I see there is no need in "Unknown" group, so you can easily get all real assignments with lead():
WITH SEQ1
AS (SELECT
"UNIQ_ID",
assign_dt,
least(end_dt, lead(assign_dt-1,1,end_dt)over(partition by uniq_id order by assign_dt)) as end_dt,
grp
FROM EXAMPLE_1)
SELECT *
FROM SEQ1
ORDER BY 1, 2;
And your original query doesn't work as you described, please try this sample data:
begin
INSERT INTO EXAMPLE_1 VALUES (888, TO_DATE('01-MAR-20'), TO_DATE('10-MAR-20'), TO_DATE('03-MAR-20'), 'Red');
INSERT INTO EXAMPLE_1 VALUES (888, TO_DATE('01-MAR-20'), TO_DATE('10-MAR-20'), TO_DATE('07-MAR-20'), 'Yellow');
INSERT INTO EXAMPLE_1 VALUES (999, TO_DATE('15-MAR-20'), TO_DATE('20-MAR-20'), TO_DATE('15-MAR-20'), 'Yellow');
-- extra:
INSERT INTO EXAMPLE_1 VALUES (111, TO_DATE('01-JUN-20'), TO_DATE('10-JUN-20'), TO_DATE('05-JUN-20'), 'Yellow');
INSERT INTO EXAMPLE_1 VALUES (111, TO_DATE('15-JUN-20'), TO_DATE('20-JUN-20'), TO_DATE('17-JUN-20'), 'Red');
INSERT INTO EXAMPLE_1 VALUES (111, TO_DATE('15-JUN-20'), TO_DATE('20-JUN-20'), TO_DATE('19-JUN-20'), 'Green');
INSERT INTO EXAMPLE_1 VALUES (111, TO_DATE('26-JUN-20'), TO_DATE('30-JUN-20'), TO_DATE('27-JUN-20'), 'Yellow');
INSERT INTO EXAMPLE_1 VALUES (111, TO_DATE('25-JUN-20'), TO_DATE('29-JUN-20'), TO_DATE('28-JUN-20'), 'Black');
commit;
end;
/
And describe what exactly do you want to see for 111
Upvotes: 1