Reputation: 21
I have a table storing records of vehicles being loaned out. A sample of the data looks like:
ASSET_ID | CATEGORY | LOANED_TO | LOAN_START | LOAN_END |
---|---|---|---|---|
00001 | Keyed | Cordon | 01/01/2022 2:00 PM | 01/03/2022 8:00 AM |
00002 | Keyless | JIP | 01/15/2022 8:30 AM | 01/15/2022 10:00 AM |
00002 | Keyless | Cordon | 01/15/2022 09:30 AM | 01/15/2022 10:30 AM |
00002 | Keyless | Craft | 01/15/2022 12:00 PM | 01/15/2022 06:00 PM |
As you can see from the sample above. Sometimes an asset can be loaned for multiple days (e.g. Asset ID: 0001 from 01/01/2022-01/03/2022) and sometimes the asset can be loaned for hours, including overlapping times (I know this seems like an issue but this is how the data is and overlpaping hours should not be double accounted.
Using the sample data above, Asset ID 0001 should look like the following:
DATE | ASSET_ID | PCT_UNAVAILABLE |
---|---|---|
01/01/2022 | 00001 | .42 |
01/02/2022 | 00001 | 1 |
01/03/2022 | 00001 | .33 |
01/15/2022 | 00002 | .33 |
Using the intial data sample this can be translated as follows:
- Asset ID 00001 was unavilable for 10 hours (2 PM - midnight) on 01/01/2022 - 42 %
- Asset ID 00001 was unavailable for 24 hours (midnight-midnight) on 01/02/2022 - 100%
- Asset ID 00001 was unavilable for 8 hours (midnight-8 AM) on 01/03/2022 - 33%
- Asset ID 00002 was unavailable for 8 hours (08:30 -10:30; 12:00-06:00) on 01/15/2022 - 33%
I was able to create a date table (CALENDAR
) that has a record for each date in a particular date range. This table, joined with the table showing loaned assets, will let me create daily records instead of ranged records; however, I have no idea how to aggregate the time.
My approach so far is to explode the records into daily records and calculate the unavilability percentage by each record, like below:
SELECT
CAL.DATE
,LA.ASSET_ID
,CASE
WHEN loan_end - loan_start >= 1 then 1 else loan_end - loan_start
END PCT_UNAVAILABLE
FROM CALENDAR CAL
JOIN LOANED_ASSETS LA on CAL.DATE >= LA.LOAN_START and CAL.DATE <= LA.LOAN_END
however, if I approach it this way, I am still not aggregating all non-overlapping hours in the day for all records. Any and all help would be greatly appreciated.
Upvotes: 0
Views: 85
Reputation: 168361
You can use:
WITH days ( asset_id, loan_start, day_end, loan_end) AS (
SELECT asset_id,
loan_start,
LEAST(TRUNC(loan_start) + INTERVAL '1' DAY, loan_end),
loan_end
FROM table_name
MATCH_RECOGNIZE (
PARTITION BY asset_id
ORDER BY loan_start, loan_end
MEASURES
FIRST(loan_start) AS loan_start,
MAX(loan_end) AS loan_end
ONE ROW PER MATCH
PATTERN (overlaps* next_time)
DEFINE
overlaps AS MAX(overlaps.loan_end) >= loan_start,
next_time AS MAX(overlaps.loan_end) >= loan_start
OR MAX(overlaps.loan_end) IS NULL
)
UNION ALL
SELECT asset_id,
day_end,
LEAST(TRUNC(day_end) + INTERVAL '1' DAY, loan_end),
loan_end
FROM days
WHERE day_end < loan_end
)
SEARCH DEPTH FIRST BY asset_id, loan_start SET order_id
SELECT asset_id,
TO_CHAR(TRUNC(loan_start), 'YYYY-MM-DD') AS day,
SUM(day_end - loan_start) AS amt_unavailable
FROM days
GROUP BY
asset_id,
TRUNC(loan_start)
Which, for the sample data:
CREATE TABLE table_name (ASSET_ID, LOAN_START, LOAN_END) AS
SELECT '00001', DATE '2022-01-01' + INTERVAL '14:00' HOUR TO MINUTE, DATE '2022-01-03' + INTERVAL '08:00' HOUR TO MINUTE FROM DUAL UNION ALL
SELECT '00002', DATE '2022-01-15' + INTERVAL '08:30' HOUR TO MINUTE, DATE '2022-01-15' + INTERVAL '10:00' HOUR TO MINUTE FROM DUAL UNION ALL
SELECT '00002', DATE '2022-01-15' + INTERVAL '09:30' HOUR TO MINUTE, DATE '2022-01-15' + INTERVAL '10:30' HOUR TO MINUTE FROM DUAL UNION ALL
SELECT '00002', DATE '2022-01-15' + INTERVAL '12:00' HOUR TO MINUTE, DATE '2022-01-15' + INTERVAL '18:00' HOUR TO MINUTE FROM DUAL;
Outputs:
ASSET_ID DAY AMT_UNAVAILABLE 00001 2022-01-01 .4166666666666666666666666666666666666667 00001 2022-01-02 1 00001 2022-01-03 .3333333333333333333333333333333333333333 00002 2022-01-15 .3333333333333333333333333333333333333333
db<>fiddle here
Upvotes: 1