Reputation: 205
I have a large table (ItemScannedTimes) that contains timestamp columns of scanned items. Each scan has an associated action_code, the actions codes are only allowed in a certain order when scanning. All Truck scans must be done before Delivery scans, and Pickup Scans last. (Truck > Delivery > Pickup). I am trying to set up a table based off this table that takes in the Max and Min scanned timestamps of each delivery stop. So i can find how long it took to service one stop and then the difference between the last stop Max scan to the current stop Min scan to find the "drive" time.
However the source data has issues where sometimes a Pickup scan is scanned as Delivery scan by accident, then the following scans are scanned as Pickups again.
So I am left with an issue like this:
action_code | scanned_timestamp | customer_address | route_name |
---|---|---|---|
Truck | 9:00 UTC | 123 | VN |
Truck | 9:01 UTC | 123 | VN |
Truck | 9:02 UTC | 123 | VN |
Delivery | 9:03 UTC | 123 | VN |
Truck | 9:04 UTC | 123 | VN |
Truck | 9:05 UTC | 123 | VN |
Delivery | 9:15 UTC | 123 | VN |
Delivery | 9:16 UTC | 154 | VN |
Delivery | 9:17 UTC | 375 | VN |
Where line 4 should of been "Truck" but was scanned as "Delivery" then the ones after were correct.
Technically I only need drive and service times for delivery and pick up records. As I wont count how long it takes to load the truck. So once I start using Lag() on records It will only need to start from the very last Truck record (Leaving the warehouse to the first delivery stop).
How can I get a sub-table from this where the whole table only start after the very last (highest scanned_timestamp) for Truck, but has all the records for Delivery and Pickup that comes AFTER that but not before. Without knowing beforehand the scanned_timestamps. Like this:
action_code | scanned_timestamp | customer_address | route_name |
---|---|---|---|
Truck | 9:05 UTC | 123 | VN |
Delivery | 9:15 UTC | 123 | VN |
Delivery | 9:16 UTC | 154 | VN |
Delivery | 9:17 UTC | 375 | VN |
Upvotes: 0
Views: 45
Reputation: 2565
Select * From ItemScannedTimes outer
Where scanned_timestamp >=
(Select max(scanned_timestamp)
From ItemScannedTimes inner
Where outer.customer_address=inner.customer_address
And outer.route_name=inner.route_name
And inner.action_code='Truck'
)
Upvotes: 0
Reputation: 393
WITH RankedScans AS (
SELECT
action_code,
scanned_timestamp,
customer_address,
route_name,
ROW_NUMBER() OVER (PARTITION BY action_code ORDER BY scanned_timestamp DESC) AS RowNum
FROM ItemScannedTimes
)
SELECT
action_code,
scanned_timestamp,
customer_address,
route_name
FROM RankedScans
WHERE
(action_code = 'Truck' AND RowNum = 1) OR
(scanned_timestamp >= (
SELECT MAX(scanned_timestamp)
FROM ItemScannedTimes
WHERE action_code = 'Truck'
))
ORDER BY scanned_timestamp;
Demo: https://dbfiddle.uk/mRIFEbqO
Upvotes: 0